One of the refreshing things about import.io is the service is accessible to users with a wide range of expertise from seasoned coders to those with no coding experience at all. From point and click data extraction, to basic integration using Google Sheets, within minutes you can be creating custom workflows and automations. One of the nice things about getting the data in a spreadsheet, is that users can quickly and easily graph, filter or even manipulate their data further with a selection of formula. There are, however, issues with spreadsheets.
In ‘Spreadsheet Addiction’ Patrick Burns highlights the ambiguity of value and formula. Because a cell can be both a formula and a value it becomes a powerful way for computing data and, in most cases, immediately seeing an affect. The issue is that the cell ends up having a double meaning, meaning we can see the result, but without exploration we don’t know if it is a pure value or dependant on other variables.
A common scenario is when you are sorting data. If this data includes calculated cells that have a dependency on other data relative to it sorting or rearranging the relative formula can mean you end up with junk. For example, when you use the importHTML formula highlighted in Andrew Fogg’s recent ‘Using Import.Io And Google Sheets’ session the formula lives in a single cell populating the cells it needs for the rows and columns of data. If we try and manipulate this data with a sort, the source cell with the formula in it could be moved which then rewrites the data relative to it:
Fortunately for ‘Sheet junkies’ the the way around this is to copy and paste all the cell values in the sheet. You could do this manually, but this may end up being a chore if the source data is always updating. Another way to do this would be using one of the import.io client libraries with the Google Spreadsheets API. One of the issues with this approach, apart from all the setup, is that the code to do it has to live on the server somewhere.
There is, however, a third way which requires less programming skills and importantly no self-hosting of code. The solution is Google Apps Script, a free cloud based programming language which lives in Google Drive and comes with Google API integration built in. One of the joys of Google Apps Script is that in a couple of lines of code we can read and write data to a Sheet, Drive or anywhere else where you can get access to it.
Lets start with a very basic example. Let say I want to email one of my import.io datasets to myself once a day. Assuming we already have our dataset in import.io next we need to create our Google Apps Script. Here is an overview of Google Apps Script with instructions to creating your first script.
Assuming you’ve been able to create your first script; next, back in your import.io dataset, click on the Integrate button and select Google Sheets from the side panel. Enter your import.io password to get your API key and import.io generates the importHTML formula for you. The bit we are interested in is the query url highlighted below – all we are doing here is using the integrate wizard to build a url for import.io’s REST Query API:
In our Google Apps Script window replace myFunction with the code in the gist replacing the YOUR_IMPORTHTML_URL with your url from above and changing the firstname.lastname@example.org to any email address you want (preferably your own for this test):
After you save your script you can Run > getImportio. As this is our first run, App Script will automatically prompt you for permission to run the script. If all has gone to plan you should now have a copy of your import.io data set in your inbox.
So how do we get this data once a day? In the script editor window if you click on Resources > Current project’s triggers ‘No triggers set up. Click here to add one now.’ This gives us the option to run our function as a time-driven event once a day:
Back to the addiction
A quick recap then. Having setup our data set on import.io in four lines of code we can fetch this using a timed trigger and email. This is just the beginning as there is a long list of Google products like Drive, Maps, Adsense and more we can hook straight into. But what about our addiction? To finish, here’s another quick Google Apps Script example to write write our import.io dataset to a sheet. This time start with a blank Google Sheet. As Google Apps Script can be standalone or associated with a Sheet, Form, Document or Site our spreadsheet has a Tools > Script editor option. Open this and replace the myFunction with the code in this gist using the same Url you used before but changing format=HTML to format=JSON.
Saving your code and then running it will again prompt an authorisation window. Once it’s finished, if you look at your spreadsheet your data should be written in the cells.
Again, this is just the tip of the iceberg. There are many more easy interactions we can code in; like inserting new rows each time we fetch the data, creating a new sheet or even creating an entirely new spreadsheet.
Martin Hawksey, Chief Innovation, Community & Technology Officer at Association for Learning Technology
There are some health warnings to be aware of when using Google Apps Script. Scripts have a maximum execution of 5 minutes. This is plenty for most applications, but unlike the examples shown here, which have purposefully been kept brief, it’s recommended to handle exceptions and errors. There are also quotas for for a number of the built in App Script services detailed in the quotas tab here. If you find you get stuck with Apps Script there is a great community ready to help out including Stack Overflow and these Google+ Communities.
Side note: We loved Martin’s integration idea so much, we turned it into a Client Library!
Turn the web into data for free
Create your own datasets in minutes, no coding required
Powerful data extraction platform
Point and click interface
Export your data in any format
Unlimited queries and APIs