Wednesday, November 25, 2015

Breaking free from X and Y: Why we use D3.js

At Lambda Prime, we often lean on D3.js to visualize processed data, regardless of whether the output will be web-based. This choice may seem obvious given D3's ability to make stunning visualizations. But once you sit down to develop a portfolio of D3-based visuals, it becomes clear that D3 is not designed as the JavaScript analog to the easy-to-use plotting functionality of Excel, Mathematica, or even libraries like ggplot or matplotlib. D3 exposes every knob and expects you to set it correctly. As Uncle Ben says, "with great power comes great responsibility," and as a D3 developer, you are responsible for every aspect of your visualization.

The host of plotting products built on top of D3.js provides a perfect illustration of this trade-off between ease of implementation and flexibility in crafting output. We've reached the point where D3-based charts can be created with a wizard in Excel... that is, if you're OK sticking with the usual complement of two-axis charts. If you eschew these reusable chart libraries and prefer diving into JavaScript just to plot an X-Y scatter or a bar chart, well, all the more power to you.

Libraries like dimple.js provide an easy way to create D3.js-based charts, so long as the simplicity of your dataset matches the simplicity of the tool.

What all of these D3-based products omit, however, is the ability to visualize disparate data types at will. You simply can't fire up NVD3.js or dimple.js and turn your Excel data into, for example, a network graph overlaid on a map of the United States.

D3.js enables visualizations that can include several types of data. In this case, geo-coded data, pairwise relationships, and extensive data are displayed simultaneously.
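To make the layering idea concrete, here is a minimal, dependency-free JavaScript sketch of the three layers described above: geo-coded points, pairwise links between them, and a quantity encoded as marker size. It is not our production code and it does not call D3 itself. The city data, the bounding box, and the helper names `project` and `renderNetworkMap` are all hypothetical, and the linear projection stands in for a real geographic projection such as D3's `geoAlbersUsa`.

```javascript
// Hypothetical sample data: coordinates are approximate, values are made up.
const cities = [
  { id: "NYC", lon: -74.0, lat: 40.7, value: 8 },
  { id: "CHI", lon: -87.6, lat: 41.9, value: 3 },
  { id: "LA", lon: -118.2, lat: 34.1, value: 4 },
];

// Pairwise relationships between cities (also hypothetical).
const links = [
  { source: "NYC", target: "CHI" },
  { source: "CHI", target: "LA" },
];

// Rough bounding box for the contiguous U.S.; an assumption for this sketch.
const bounds = { west: -125, east: -66, south: 24, north: 50 };

// Simple linear mapping from longitude/latitude to pixel coordinates.
// A real map would use a proper geographic projection instead.
function project(lon, lat, width, height) {
  const x = ((lon - bounds.west) / (bounds.east - bounds.west)) * width;
  const y = ((bounds.north - lat) / (bounds.north - bounds.south)) * height;
  return [x, y];
}

// Build an SVG string with a link layer drawn beneath a node layer.
function renderNetworkMap(cities, links, width, height) {
  // Geo-coded layer: look up each city's pixel position by id.
  const pos = new Map(
    cities.map((c) => [c.id, project(c.lon, c.lat, width, height)])
  );

  // Relationship layer: one line per pair of connected cities.
  const lines = links.map((l) => {
    const [x1, y1] = pos.get(l.source);
    const [x2, y2] = pos.get(l.target);
    return `<line x1="${x1.toFixed(1)}" y1="${y1.toFixed(1)}" ` +
      `x2="${x2.toFixed(1)}" y2="${y2.toFixed(1)}" stroke="gray"/>`;
  });

  // Quantity layer: circle area (not radius) is proportional to the value.
  const circles = cities.map((c) => {
    const [cx, cy] = pos.get(c.id);
    const r = 2 * Math.sqrt(c.value);
    return `<circle cx="${cx.toFixed(1)}" cy="${cy.toFixed(1)}" ` +
      `r="${r.toFixed(1)}" fill="steelblue"/>`;
  });

  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" ` +
    `height="${height}">${lines.join("")}${circles.join("")}</svg>`;
}

const svg = renderNetworkMap(cities, links, 590, 260);
console.log(svg);
```

Even in this toy form, every decision (projection, draw order, how size encodes quantity) is made by hand, which is the same trade-off D3 asks of you at a larger scale.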

And so that's why we spend our time developing in D3. It allows us to break out of the X-Y mold of data visualization and layer on additional details that conventional charts omit. There's no wizard for it, but the investment in one-of-a-kind visualizations enables analysis that would be impossible if limited to just X and Y.