Example: extract pricing data from any web page

This article was originally written in Italian for: VIETATO STACCARE IL CERVELLO

by Irene Scapin 🙂@iome_e_irene

After reading the article All the best big data tools and how to use them, I tried downloading the program to extract pricing data from a web page so that it could be processed later. By subscribing to the trial version of the program, you can download data from 500 URLs.

At first, working out what to do was a little daunting, but I quickly found the right order of commands and came to appreciate the usability. The potential of the service is enormous.

I used the program to monitor competitor pricing. For my use case, it saved the hours I usually spend recording competitors’ prices in the surveys I make of other brands’ e-commerce sites. The program exports web page data, such as the list of offers presented after clicking an e-commerce website’s “submit” button. In just a few minutes, the data can be extracted for other purposes.

In the program, I created a new extractor by clicking “New Extractor” and entering the link to the category page, as shown below.

Warning: If the page limits the number of products it shows, it is useful to open the page first, maximize the number of visible products, and then copy that URL to create the new extractor.

data extraction

After the page loads, you will see that the program automatically schematizes the page and tabulates the different attributes as columns, as in the case below:

table of data

If you don’t need all of this information and just want to extract pricing data, for example, click “Start over with empty table” and select the data and columns you would like to extract. The attribute to select is shown in a red dotted box:

select data

By double-clicking the dotted box where “price” is written, you can customize the name of the attribute. To select the attribute, click on the page. For example, I wanted to select the price as the attribute. First I clicked the $84 price, and the box turned green. Clicking the second price ($249) trains the extractor to automatically select all the prices on the page, so the prices are ready to be extracted to a tabular spreadsheet. The extracted data:

extracted data
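The article does not describe how the program generalizes from two clicked prices to every price on the page. Purely as an illustration, one way to picture the idea is to note which tag and CSS class the two examples share and then collect every element with the same pair. The sketch below does this with Python's standard library. The names `TextCollector`, `generalize` and `SAMPLE_PAGE`, the `name` and `price` classes, the product names and the $35 price are all hypothetical. Only the $84 and $249 values come from the walkthrough above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Record (tag, class, text) for every text run and its enclosing element."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements as (tag, class) pairs
        self.records = []  # (tag, class, text) for each non-empty text run

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        tags = [t for t, _ in self.stack]
        if tag in tags:
            # Drop the last matching open tag and anything left open above it.
            last = len(tags) - 1 - tags[::-1].index(tag)
            del self.stack[last:]

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            tag, css_class = self.stack[-1]
            self.records.append((tag, css_class, text))


def generalize(html, example_a, example_b):
    """Return all texts whose (tag, class) matches both clicked examples."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    signature = {}
    for tag, css_class, text in collector.records:
        if text in (example_a, example_b):
            signature[text] = (tag, css_class)
    # Both examples must be found and must sit in the same kind of element.
    if len(signature) < 2 or signature[example_a] != signature[example_b]:
        return []
    target = signature[example_a]
    return [text for tag, css_class, text in collector.records
            if (tag, css_class) == target]


# Hypothetical category page; real pages are far messier.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Jacket</span> <span class="price">$84</span></li>
  <li><span class="name">Coat</span> <span class="price">$249</span></li>
  <li><span class="name">Scarf</span> <span class="price">$35</span></li>
</ul>
"""

prices = generalize(SAMPLE_PAGE, "$84", "$249")
print(prices)  # ['$84', '$249', '$35']
```

Matching on tag and class alone is a deliberate simplification. It is enough to show why two examples can pin down a whole column while a single click may be ambiguous.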

The extracted data then appears. Setting up all the extractions in the same way makes the resulting files easier to compare. You can also schedule these extractions at a cadence of your choice. Scheduling an extraction:

schedule extractor
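Once several extractions share the same column layout, comparing them can be automated. The following is a minimal sketch, assuming the exports are saved as CSV files with `product` and `price` columns. The article does not state the file format or the column names, so both are assumptions. The helper names and sample rows are hypothetical too, including the $79 price. In-memory strings stand in for the exported files so the example runs on its own.

```python
import csv
import io


def parse_price(text):
    """Turn a price string such as "$1,249.50" into a float."""
    return float(text.replace("$", "").replace(",", "").strip())


def load_prices(csv_file):
    """Map product name -> price from a CSV with 'product' and 'price' columns."""
    reader = csv.DictReader(csv_file)
    return {row["product"]: parse_price(row["price"]) for row in reader}


def price_differences(old, new):
    """Return {product: (old_price, new_price)} for products whose price changed."""
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }


# Hypothetical exports from two runs of the same extractor.
RUN_1 = "product,price\nJacket,$84\nCoat,$249\n"
RUN_2 = "product,price\nJacket,$79\nCoat,$249\n"

changes = price_differences(
    load_prices(io.StringIO(RUN_1)),
    load_prices(io.StringIO(RUN_2)),
)
print(changes)  # {'Jacket': (84.0, 79.0)}
```

With real exports, the `io.StringIO(...)` arguments would be replaced by files opened with `open(path, newline="")`.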

Thanks for reading, I hope you enjoyed it! If you would rather read the Italian version, it is available at VIETATO STACCARE IL CERVELLO.
