You can download the data saved for Crawlers and Datasets programmatically using the API. You can see this process in full by watching our webinar video, but here is a step-by-step example complete with API calls.
For this example I will use this Amazon crawler I built during the webinar.
In order to get the data over the API, you need the data source’s GUID, which is available on your My Data page. In this case it is “83b8ad35-5a80-4889-80ec-d5718627e77e”.
Next, you need to get the data that you see on this web page over the API. To do this, add that GUID to the end of this URL (https://api.import.io/store/connector/) and paste it into your browser.
For all of these examples, you will need to authenticate. In order to do this, get your User GUID and API key from your account page. Then, URL-encode your API key. Finally, append them to the example URLs in this format:
My URL to my data source looks like this: https://api.import.io/store/connector/83b8ad35-5a80-4889-80ec-d5718627e77e (don’t forget to include your authentication! See above)
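If you would rather build that URL in code than by hand, here is a minimal Python sketch. The query parameter names `_user` and `_apikey` are my assumption, so check them against the format shown on your account page. The function name `connector_url` is also just for illustration.

```python
from urllib.parse import quote

API_BASE = "https://api.import.io/store/connector"


def connector_url(source_guid, user_guid, api_key,
                  user_param="_user", key_param="_apikey"):
    """Build the data source URL with authentication appended.

    user_param and key_param are assumed parameter names; override them
    if your account uses a different format.
    """
    # URL-encode the API key so characters like '/', '+' and '=' survive.
    encoded_key = quote(api_key, safe="")
    return (f"{API_BASE}/{source_guid}"
            f"?{user_param}={user_guid}&{key_param}={encoded_key}")
```

Calling `connector_url("83b8ad35-5a80-4889-80ec-d5718627e77e", your_user_guid, your_api_key)` should give you a URL you can paste into a browser or pass to an HTTP client.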
This is the data that comes back from the API:
From the response, check the “snapshot” field. The snapshot GUID identifies the current version of the data that has been saved for the Crawler or Dataset. When you find that GUID, append /_attachment/snapshot/ followed by the snapshot GUID to your first URL, so that it ends in /_attachment/snapshot/GUID.
This URL will provide you with all of the Crawler or Dataset’s data as a JSON file.
Here is an example of how that data looks:
Whenever you update the data for a Crawler or Dataset (by saving it), don’t forget to make both requests again, as the snapshot GUID changes each time the data is saved.
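The two requests can be chained in a short script. This is only a sketch: `fetch_json`, `snapshot_url` and `download_latest` are illustrative names of my own, and `auth_query` stands for whatever authentication query string your account requires (without the leading `?`).

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.import.io/store/connector"


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def snapshot_url(source_guid, connector_record):
    """Build the download URL from the "snapshot" field of the first response."""
    snapshot_guid = connector_record["snapshot"]
    return f"{API_BASE}/{source_guid}/_attachment/snapshot/{snapshot_guid}"


def download_latest(source_guid, auth_query, fetch=fetch_json):
    """Make both requests: look up the current snapshot, then download it."""
    # First request: the data source record, which names the current snapshot.
    record = fetch(f"{API_BASE}/{source_guid}?{auth_query}")
    # Second request: the saved data for that snapshot, as JSON.
    return fetch(f"{snapshot_url(source_guid, record)}?{auth_query}")
```

Because the snapshot GUID is looked up on every call, running `download_latest` after each save should always return the newest version. The `fetch` argument is there so you can swap in a different HTTP client.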
You can see a list of all of the GUIDs for different versions by using the history API, with a URL that looks like this:
Here is an example response from this request:
Each of the “_id” fields in the “hits” array is a GUID that you can use in place of the snapshot GUID in the second URL, which downloads that version of the Crawler or Dataset’s data.
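Here is a rough Python sketch of turning a history response into one download URL per saved version. I am assuming that “hits” sits at the top level of the parsed JSON, and the function names are hypothetical. You still need to append your authentication to each URL.

```python
API_BASE = "https://api.import.io/store/connector"


def version_guids(history_response):
    """Collect the "_id" of every entry in the "hits" array."""
    # Assumes "hits" is a top-level key; adjust if your response nests it.
    return [hit["_id"] for hit in history_response.get("hits", [])]


def version_urls(source_guid, history_response):
    """Build one snapshot download URL per saved version."""
    return [
        f"{API_BASE}/{source_guid}/_attachment/snapshot/{guid}"
        for guid in version_guids(history_response)
    ]
```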