Selecting data from a webpage with the point-and-click interface
This topic describes how to select data from a webpage to train your extractor to retrieve the data you want. Training happens in the webpage view of the editor using the Import.io point-and-click interface.
To train your extractor to capture the kinds of data you want, perform the following steps:
In the webpage view of the editor, add a new column. (Alternatively, select a column with incomplete data by clicking a column name in the column headings bar.) The data column appears in the floating data column window, displaying any previously identified data.
Move your mouse pointer around the webpage. The pointer changes to an arrow with a green + icon, and a pink outline box appears around the data under the pointer, changing as the pointer moves.
Position the outline around exactly the piece of information you want to capture, then click. The thin pink outline changes to a thicker green outline and the data appears as a new row in the floating window.
Note: When you select data that contains a link, the Question? dialog box appears:
Click Yes to include the URL associated with the link text as part of your data set. Link text appears blue in the floating window. Although the data appears in one column in the editor, when your extractor runs it stores the link text and the URL in separate columns in your CSV or as a separate data point in the JSON response.
To include multiple items in the same column, repeat the point-and-click process. The thin pink outline changes to a thicker green outline and the data is added to the column in one of the following ways:
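As a rough illustration of that storage behavior, the following Python sketch flattens a link value into two CSV columns. The record shape and every name in it (`text`, `href`, `Title`, `Title_link`, `flatten_link_column`) are assumptions made for this example, not Import.io's documented output format.

```python
import csv
import io


def flatten_link_column(rows, column):
    """Split each link value in `column` into a text column and a URL column.

    A link value is assumed (hypothetically) to look like
    {"text": "...", "href": "..."}; other values are copied unchanged.
    """
    flat = []
    for row in rows:
        out = {}
        for name, value in row.items():
            if name == column and isinstance(value, dict):
                out[name] = value.get("text", "")
                out[f"{name}_link"] = value.get("href", "")
            else:
                out[name] = value
        flat.append(out)
    return flat


def to_csv(rows):
    """Serialize a list of flat dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample: one column holding a link, one holding plain text.
sample = [
    {
        "Title": {"text": "Blue widget", "href": "https://example.com/widgets/blue"},
        "Price": "9.99",
    },
]
flat = flatten_link_column(sample, "Title")
csv_text = to_csv(flat)
```

In this sketch the single editor column `Title` becomes two CSV columns, one for the link text and one for the URL, while `Price` is left as is.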
- For multi-item pages, a new row of data appears in the floating window.
- For single-item pages, Import.io adds the data to the same row in the column (separated by commas) and a +n items icon appears next to the first data point you selected in the floating window.
Note: If Import.io adds multiple data rows for your single-item data, use the single row option to force the data into a single row.
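The difference between the two page types can be pictured with a small Python sketch. It only models the behavior described above; the function name `add_selection` and the list-of-strings representation of a column are hypothetical, and the separator is assumed to be a comma followed by a space.

```python
def add_selection(column, value, single_item):
    """Return a new column (a list of row strings) with `value` added.

    Multi-item pages: each selection becomes its own row.
    Single-item pages: each selection after the first is appended to the
    same row, separated by commas.
    """
    if single_item and column:
        # Join the new value onto the existing (last) row.
        return column[:-1] + [f"{column[-1]}, {value}"]
    # First selection, or a multi-item page: add a new row.
    return column + [value]


selections = ["red", "green", "blue"]

multi_item_column = []
single_item_column = []
for picked in selections:
    multi_item_column = add_selection(multi_item_column, picked, single_item=False)
    single_item_column = add_selection(single_item_column, picked, single_item=True)
```

Here `multi_item_column` ends up with three rows and `single_item_column` with one row holding all three values, which mirrors the two cases in the list above.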
At some point in the selection process, the editor recognizes the pattern in what you are adding and makes additional selections for you.
When the editor’s analysis is incorrect, unselect the undesired data points. Move your mouse pointer over a selected data point; the pointer changes to an arrow with a red – icon. Click to remove the undesired data point.
Note: To unselect all the data in the column, click Clear data at the top of the floating window.
Note: If you click something by mistake, you can use undo and redo to reverse or restore the selection.
Repeat the process for each column with data points you want to collect.