In my job at Silk.co, I help lots of journalists, NGO workers and marketers build data visualizations from spreadsheets. Often we use Import.io to extract data from the public Internet and push it into a spreadsheet, which we then upload into Silk for further analysis and visualization. (Here’s one we did with Import.io about Uber Jobs, which was picked up in Mashable.) I spend considerable time thinking about how best to represent data with visualizations, though I am by no means an expert in data visualization on the level of Alberto Cairo or Edward Tufte.
That said, I do have some basic visualization guidelines that I use. These guidelines enable anyone to quickly match the goal of their data visualization to the visualization type (or types) that should work best for their data.
Here’s a short version of the Silk Data Visualization Guidelines:
- Use lists and tables to show simple ranking of data;
- Use maps for location-specific data;
- Use scatter plots to show the relationship between two numbers;
- Use donut and pie charts for showing proportions and distributions;
- Use vertical column charts to compare a few items;
- Use horizontal bar charts to compare many items;
- Use stack charts to show relative components as part of a whole;
- Use grids and mosaics to show images;
- Use groups to organize items sharing a characteristic or property.
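For readers who like to see rules as code, the short list above can be sketched as a simple lookup. This is only an illustration, not a Silk feature or API: the goal names, the `suggest_chart` function and the `few_items_max` cutoff are all assumptions made up for this example. The guidelines say "a few" versus "many" items without giving a number, so the cutoff of 8 is a placeholder you should tune.

```python
# Hypothetical goal names mapped to the chart types from the list above.
GUIDELINES = {
    "ranking": "list or table",
    "location": "map",
    "relationship": "scatter plot",
    "proportion": "donut or pie chart",
    "part_of_whole": "stack chart",
    "images": "grid or mosaic",
    "shared_property": "groups",
}


def suggest_chart(goal, item_count=0, few_items_max=8):
    """Return a suggested visualization type for a goal.

    few_items_max is an assumed cutoff between "a few" and "many" items;
    the guidelines themselves do not specify a number.
    """
    if goal == "comparison":
        # A few items: vertical columns. Many items: horizontal bars.
        if item_count <= few_items_max:
            return "vertical column chart"
        return "horizontal bar chart"
    if goal not in GUIDELINES:
        raise ValueError(f"No guideline for goal: {goal!r}")
    return GUIDELINES[goal]


if __name__ == "__main__":
    print(suggest_chart("location"))
    print(suggest_chart("comparison", item_count=20))
```

A lookup like this is only a starting point. Many data sets fit more than one goal, in which case it is worth trying each suggested type.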
That’s the simple version. Here’s a longer one.
Lists and Tables
These are particularly good if you want to show multiple columns of data in a simple format and if you want viewers to be able to sort the data quickly and establish rankings based on specific attributes. For data such as salary databases, real estate property listings, or tables of statistics of professional athletes, a table is a safe bet. A wide variety of tools can handle tables very well. After all, tables are the workhorse of the data visualization universe.
This is a table of AngelList Syndicate Leaders.
Maps
Obviously, locations are best shown on maps. Maps can also be a clear way to show numerical values grouped by location (a number plot or a bubble plot). For example, to show which neighborhoods in San Francisco or London have the highest per-square-foot rents, a number plot map is often a good choice. A similar map visualization is an area map. Area maps are colored to show data values. For example, a map of the United States might show purple, blue and red states to demonstrate political affiliation. Closely related to the area map is the heat map, which uses color gradients to show data values. Heat maps work best when there are many points on a map, each with a specific number value. Another nice way to visualize data with maps is to assign different colors to map location pins based on specific data attributes. For example, in a map of restaurants in a city center, all Italian restaurants might have red map pins while all Chinese restaurants might have blue map pins.
Here’s a map of food poisoning outbreaks in LA.
Scatter Plots
These charts show one set of numerical values on the X-axis and another on the Y-axis. Scatter plots can highlight outliers and relationships between types of data. For example, data on kindergarten vaccination rates in California can be placed on one axis while school size can be placed on the other in order to see if smaller schools are more likely to have lower vaccination rates. (This is, by the way, public data put out by the California Department of Public Health.) However, scatter plots are not good at all for ranking factors or comparing more than two data attributes. That’s why you may need to flip through multiple scatter plots to find clear relationships, outliers and data stories.
Here’s an example scatter plot of points given and received by country in the latest Eurovision song contest.
Donut and Pie Charts
Lots of people in the data visualization world hate pie charts. We don’t feel quite so vehemently, although we do think they work best for showing the proportion or distribution of a small number of data values. Donut charts can handle a few more data values because they have a bit more space (the hole in the middle, you see). In general, you never want to show more than 10 slices of data in one of these charts because the smaller slices quickly become indecipherable. That’s particularly true on a smartphone. On donut and pie charts, too, the legend can quickly become unwieldy and nearly as tall as the chart.
This is a donut chart showing the popularity of sports by country.
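If your data has more than 10 categories, one common workaround is to keep the largest slices and fold the rest into a single catch-all slice before charting. The sketch below shows that idea. Note that the "Other" bucket is my own suggestion rather than part of the guidelines above, and the function name `cap_slices` and its parameters are hypothetical.

```python
from typing import Dict


def cap_slices(values: Dict[str, float],
               max_slices: int = 10,
               other_label: str = "Other") -> Dict[str, float]:
    """Limit a pie/donut data set to max_slices entries.

    Keeps the largest (max_slices - 1) values and sums everything else
    into one catch-all slice. The "Other" label is an assumption.
    """
    # Sort categories from largest to smallest value.
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= max_slices:
        return dict(ranked)

    kept = dict(ranked[: max_slices - 1])
    remainder = sum(v for _, v in ranked[max_slices - 1:])
    # If a kept slice is already labeled "Other", add to it.
    kept[other_label] = kept.get(other_label, 0) + remainder
    return kept


if __name__ == "__main__":
    sample = {f"Sport {i}": 20 - i for i in range(15)}
    for label, value in cap_slices(sample).items():
        print(label, value)
```

The trade-off is that the catch-all slice hides detail, so if the small categories matter to your story, a table or horizontal bar chart is probably the better choice.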
Vertical, Stack and Horizontal Column Charts
Flat out, horizontal charts look better on mobile. They also do a much better job showing a larger amount of data. For example, a horizontal column chart showing the 20 most expensive real estate markets in the U.S. would fairly easily handle that load, and the horizontal direction of the labels inside or under the columns would scale up and down very nicely. In contrast, a vertical column chart would end up with an overly busy axis (which is why we remove the axis labels in responsive designs) and the columns might feel more tightly packed. Horizontal stack charts can display more data than horizontal column charts because the different colors of the stack elements make it easier to see the different data values in each stack. We’re big fans of colorful stack charts like the one in this article by the San Francisco Chronicle about the resegregation of San Francisco schools.
Groups
Because they simply show elements organized by like values, groups are not always considered a data visualization. But grouping elements based on numbers or data attributes is a very clean way to visually compare elements. For example, in a Silk about the highest paid athletes, a group view might organize the athlete datacards by sport.
Like tables, groups are a simple but powerful data visualization.
Keep It Simple, Silly:
Our final basic rule for data visualizations is that less is more. Pick a visualization that cleanly and powerfully tells a story from your data and let that visualization speak for itself. Putting more than one or two data visualizations on a page that already has loads of text, images and videos makes your content busy and makes your data harder to understand. In online environments where there is less noise and there are no advertisements, adding more visualizations can work in some circumstances. But the rule of 1-2 visualizations per page is a simple way to make sure your readers can easily understand your data.
Hopefully this quick primer gives you a basic guide to visualizations to help you think about the best ways to represent data you have in a spreadsheet. In subsequent posts, we’ll talk about the best visualizations for viewing on mobile devices and how to think about labeling your data in the spreadsheet or in your visualization tool in order to maximize legibility and help readers quickly see the data story you are telling. Thanks for reading and feel free to hit me with comments at @alexsalkever.
Alex Salkever is head of product marketing at Silk.co. He is also a data geek and former journalist who served as technology editor of BusinessWeek.com. He blogs regularly about data topics on the Silk blog.