APIs are all the rage these days, and with good reason. Gone are the days when people relied on desktop applications, whether web-based or native, to interact with data that lived on a proprietary backend. These days, the primary computing device is more often than not a phone or tablet. The apps on those devices may be browser-based or native, and many consume data from more than one source. Data is coming from everywhere. How do applications access this landscape of distributed data sources? Typically via an API.
However, there is still a lot of data out there on web pages that isn't accessible via an API (that is, the website operator doesn't offer one), including data of interest to developers, business analysts, report writers, and other parties. And even when a site does offer an API, accessing it takes a degree of programming know-how that most business folks don't have. Yet they still need that data. What's to be done?
Getting structured data out of web pages, a practice often referred to as "web scraping," is a real need, particularly for people whose job it is to prepare and analyze the information available in web pages. Meeting this need is right up the alley of a data extraction tool such as Import.io.
Import.io allows you to ingest the data in a web page and convert it into structured data that you can use in a spreadsheet or express as JSON.
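Import.io does this without any code, but it helps to see what "web page in, structured data out" means concretely. The sketch below uses only Python's standard-library `html.parser` on a made-up product table; the HTML snippet and field names are purely illustrative, not anything Import.io itself produces:

```python
import json
from html.parser import HTMLParser

# Illustrative HTML standing in for a page you might want to scrape.
HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

class TableExtractor(HTMLParser):
    """Collects <th>/<td> text into rows, one list per <tr>."""
    def __init__(self):
        super().__init__()
        self.rows, self.current, self.in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self.in_cell = True
        elif tag == "tr":
            self.current = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False
        elif tag == "tr" and self.current:
            self.rows.append(self.current)

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.current.append(data.strip())

parser = TableExtractor()
parser.feed(HTML)

# First row is the header; zip it with each data row to build records.
headers, *records = parser.rows
data = [dict(zip(headers, row)) for row in records]
print(json.dumps(data, indent=2))
```

Running this prints a JSON array of objects, one per table row, which is exactly the shape of output a spreadsheet import or downstream report would want. The point of a tool like Import.io is that non-programmers get this result without writing the parser at all.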