Import.io Extract

Get web data with no code

Using highly sophisticated machine-learning algorithms, we try to extract the data you want automatically. If we don’t get it right, no need to worry: use the point-and-click interface to show us what you want and we’ll go get it for you.

Deal with interactive websites

With page interaction, you can interact directly with the website while recording every click, scroll, and hover. For data that is spread over multiple pages, we’ve also got you covered with advanced yet simple-to-use features such as automatic pagination.

Get data from hierarchical sites, like a product list and detail pages

With page linking, a list page can be chained with the detail page for each item on that list. For instance, the top-level list of Mexican restaurants in San Jose on Yelp has some data, but when you click on each restaurant you get more data about that restaurant. Chaining allows you to pull all of the detail pages at the same time.
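
To make the chaining idea concrete, here is a minimal Python sketch. It is not Import.io's implementation: the `FAKE_SITE` dictionary, the `extract` and `chain` functions, and every URL and field in it are made up for illustration, and the "site" is an in-memory stand-in rather than a real website.

```python
# Stand-in for a website: URL -> data already extracted from that page.
# Every URL, name, and field below is hypothetical.
FAKE_SITE = {
    "/restaurants": {
        "items": [
            {"name": "Casa Uno", "detail_url": "/restaurants/casa-uno"},
            {"name": "Dos Tacos", "detail_url": "/restaurants/dos-tacos"},
        ]
    },
    "/restaurants/casa-uno": {"phone": "555-0101", "rating": 4.5},
    "/restaurants/dos-tacos": {"phone": "555-0102", "rating": 4.0},
}


def extract(url):
    """Hypothetical extractor: return the data found on a single page."""
    return FAKE_SITE[url]


def chain(list_url):
    """Merge each row of the list page with the data from its detail page."""
    rows = []
    for item in extract(list_url)["items"]:
        detail = extract(item["detail_url"])
        # One combined record per list item: list fields plus detail fields.
        rows.append({**item, **detail})
    return rows


rows = chain("/restaurants")
for row in rows:
    print(row)
```

Each resulting row carries both the list-level fields (name, link) and the detail-level fields (phone, rating), which is the shape of data that chaining a list page to its detail pages produces.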

Get data from millions of pages in a single run

One page of data is good, thousands of pages are better, millions are amazing. Use our URL generator to build thousands of URLs from common patterns in seconds, or use our smart URL detector to discover similar URLs. Save them and run one job on millions of pages.
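
As a rough illustration of pattern-based URL generation, the Python sketch below expands a URL template into one URL per combination of placeholder values. It is an assumption-laden example, not the Import.io URL generator itself: `generate_urls`, the `example.com` pattern, and the `category` and `page` placeholders are all hypothetical.

```python
from itertools import product


def generate_urls(template, **ranges):
    """Expand a URL template into one URL per combination of placeholder values."""
    names = list(ranges)
    urls = []
    # product() yields every combination, varying the last placeholder fastest.
    for combo in product(*(ranges[name] for name in names)):
        urls.append(template.format(**dict(zip(names, combo))))
    return urls


# Hypothetical pattern: example.com is a placeholder, not a real target site.
urls = generate_urls(
    "https://example.com/{category}?page={page}",
    category=["shoes", "hats"],
    page=range(1, 4),
)
for url in urls:
    print(url)
```

Two categories and three page numbers give six URLs here; widening the ranges scales the same pattern to thousands of URLs.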

Easy scheduling

Get exactly the data that you want, exactly when you want it. Schedule away with just a few clicks and let us do the rest. Almost any combination of time, day, week, and month is supported. You can also set up email alerts for when your data is ready.

Get any data, including images and files

Download images and documents along with all the web data in one run. Marketing teams use it to download product images from manufacturers’ sites when building and updating ecommerce sites. Programmers pull millions of images for AI training. Academic users download hundreds of documents at once from multiple sites for research. Manufacturers can capture a PDF or PNG image to monitor compliance with minimum advertised price policies.

Get data from behind a login

Authenticated extraction allows you to get data that is only available after logging into a website. You provide the appropriate credentials and Import.io will do the rest.

Save screen captures

One more way to ensure compliance and accuracy is to use Import.io screen capture. Just turn it on and run your extraction. A screenshot of each web page will be recorded and stored.