I’ll be the first to admit that data sets aren’t exactly the most exciting things to look at. They’re great for running analysis on, but spotting trends and patterns can be pretty difficult when everything is just lined up in rows and columns. Which is why data visualization tools are so important! Now, we’re not trying to reinvent the wheel over here at import (we’ll probably never make a viz tool), but we realize that getting the data is only half the story. So, we’ve set up a direct line to the guys over at Plot.ly (a web-based graphing platform), to let you send your data straight into an awesome viz. In this webinar, I’m going to give you an overview of how it works…
Extract LOTS of Data
Now, one of the reasons we chose Plot.ly for our first integration is that they are the perfect platform for vizzing with lots and lots of data (and we’re good at pulling lots of data). To illustrate this point, I’m going to pull a load of historical data (going back to 1946) on oil prices.
The first thing I always try is Magic, ‘cause it’s super quick and all I have to do is paste the URL into it and click “Get Data”. In this particular case, Magic is picking up information that I don’t actually want. This happens sometimes and largely depends on the layout of the page. Unfortunately, I can’t edit the data Magic extracts (yet) – I could download it as a CSV, do some data clean up and then upload it to Plot.ly – but then I wouldn’t be able to show you the awesome integration!
Magic extracting extra (unwanted) data
Anyway, if Magic doesn’t work exactly the way you want, you just need to download our desktop app (don’t worry, it’s still free) and do a little bit more manual training. Because this data is displayed on a single page, I’m going to use the Extractor (New) tool. Generally, you just need to click on one (sometimes two) examples, and our algorithms will figure out the rest. But, once again, you can see it’s picked up data that I don’t want.
To fix this, I simply turn on Manual Row Training by clicking the button and clear the existing training. Then, just like in the classic workflow, I highlight one full row of data by clicking and dragging.
This time when it picks up the extra rows, I click on the wrench (spanner) to access the advanced settings. At the top you’ll notice that there is the word “Skip” with a box next to it. This will let you skip as many rows from the top as you want by typing the number (in this case 3) into the box and hitting ENTER.
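If you did go the CSV route instead (download, clean up, then upload), the same "skip the first few rows" idea can be sketched in a few lines of Python. This is only an illustration using the standard library: the helper name `skip_leading_rows` and the sample rows are made up for the example, and it says nothing about how the Skip setting works inside the app.

```python
import csv
import io
from itertools import islice


def skip_leading_rows(lines, skip):
    """Parse CSV lines and return the rows as lists, ignoring the first `skip` rows."""
    reader = csv.reader(lines)
    return list(islice(reader, skip, None))


# Made-up sample standing in for a downloaded CSV; not real extracted data.
sample = io.StringIO(
    "Unwanted row one\n"
    "Unwanted row two\n"
    "Unwanted row three\n"
    "year-a,$1.00,$2.00\n"
    "year-b,$1.50,$2.50\n"
)

# Skip 3 rows, matching the number typed into the Skip box above.
rows = skip_leading_rows(sample, skip=3)
print(rows)
```

The cleaned rows could then be written back out with `csv.writer` before uploading.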
Then you can exit the manual row training (you’ll see it’s skipped the first 3 rows) and continue training your columns. When you’re training your data, remember to set each column’s data type to match the data you’re extracting. For example, I want the “Nominal Price” to be mapped as MONEY and not TEXT.
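To see why the data type matters, here is a small Python sketch of the difference between treating a price as text and treating it as a number. It assumes US-style formatting (a leading "$" and comma thousands separators), and `parse_money` is a hypothetical helper written for this example, not something the app exposes.

```python
from decimal import Decimal


def parse_money(text):
    """Convert a string such as "$1,234.50" to a Decimal (assumes US-style formatting)."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)


# As text, prices compare character by character, so the larger price
# looks "smaller" here because "1" sorts before "9".
print("$1,234.50" < "$99.00")

# As numbers, the comparison and the arithmetic behave as expected.
print(parse_money("$1,234.50") < parse_money("$99.00"))
print(parse_money("$1,234.50") + parse_money("$0.50"))
```

A chart built from text values would run into the same ordering problem, which is why it's worth getting the type right before exporting.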
When I’m finished, I simply click “Done”, name my data set and click “Publish” – which will create an API! Then you’ll be taken to your newly created data set on the My Data page…
Graph that data!
You’ll notice on the My Data page there is now another tab in your data set (next to Google Sheets) called Plot.ly. Click this tab and then tick the boxes next to the columns you want to export (in this case I want all of them).
Once you’ve done that, all you need to do is hit “Export to Plot.ly” and you’ll be taken to the Plot.ly interface in a new tab with all your data loaded in automatically. Everything else will now be done in Plot.ly.
Note: you’ll need to create a free Plot.ly account
For this chart I’m going to put the year on the X-axis and the Nominal and Inflation Adjusted Price on the Y-axis. Then I just click “Line Plot” and Plot.ly does the rest, to give me this:
I can now give my chart a title, change the sizes, fonts, colors, etc. I’m still exploring all the fun things that can be done in Plot.ly (I’m by no means an expert – yet), but if you have any questions about how to use Plot.ly or want to learn more about their features, you can head over to their knowledge base.
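For anyone who prefers code to clicking, the chart described above (year on the X-axis, the two price columns as lines, plus a title) can be sketched as a Plotly-style figure description in Python. This is a rough sketch under a few assumptions: the helper name `line_plot_spec` is made up, the axis labels are my own, and the numbers are placeholders rather than the real oil prices. It only builds the description using the standard library; it doesn't talk to Plot.ly.

```python
import json


def line_plot_spec(x, series, title):
    """Build a Plotly-style figure dict with one line trace per named series."""
    traces = [
        {"type": "scatter", "mode": "lines", "name": name, "x": list(x), "y": list(y)}
        for name, y in series.items()
    ]
    return {
        "data": traces,
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": "Year"}},
            "yaxis": {"title": {"text": "Price"}},
        },
    }


# Placeholder values only; these are not real years or oil prices.
years = [1, 2, 3]
figure = line_plot_spec(
    years,
    {
        "Nominal Price": [10.0, 12.0, 11.0],
        "Inflation Adjusted Price": [20.0, 21.0, 19.0],
    },
    title="Oil prices (placeholder data)",
)
print(json.dumps(figure, indent=2))
```

If you have the `plotly` Python package installed, passing this dict to `plotly.graph_objects.Figure(figure)` and calling `.show()` should draw it, though everything in this post was done in the web interface.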
Next week I’ll be joined by Alex Salkever from Silk.co. Together we’ll show you yet another great tool for visualizing your data.