Data professionals spend 73% of their time getting and preparing their data and just 27% on actually gaining insights, according to a 2018 IDC Business Analytic Solutions survey. Import.io has already slashed the time it takes to get the data and is now reducing the time it takes to prepare and update it.
Import.io is reducing the time-to-insight with Import.io transform. This new capability allows you to clean, prepare, and wrangle the web data you have extracted by using over 100 Excel-like functions and formulas. Rather than downloading data to an Excel spreadsheet or using a third-party package and trying to keep it up to date when a website changes, you can let Import.io do the work for you. Once a transformation is created, it is applied every time your scheduled data extraction runs, leaving you with clean, ready-to-use data for your AI or machine learning project or for reporting. All of the data, both original and transformed, is stored in Import.io’s cloud-based service, where it is available for further analysis.
Import.io transform uses Excel-like formulas to derive new data from the data you have extracted. For instance, you can:
- Add numbers or combine words from multiple columns
- Determine keyword frequency in a description block of text
- Fill in data gaps by using information in another column
- Ensure data consistency
- Remove superfluous words from a column for a clean set of data
- Ensure data from hundreds of different sites is in a common data schema
- And so much more…
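To make one of these items concrete, here is a minimal Python sketch of counting keyword frequency in a description. This post does not show Import.io's own formula syntax, so this is not the product's function: `keyword_frequency` and the sample description are hypothetical and only illustrate the logic.

```python
import re
from collections import Counter


def keyword_frequency(description, keywords):
    """Count how often each keyword appears in a block of text (case-insensitive)."""
    # Split the description into lowercase word tokens.
    words = re.findall(r"[a-z0-9']+", description.lower())
    counts = Counter(words)
    # Counter returns 0 for keywords that never appear.
    return {keyword: counts[keyword.lower()] for keyword in keywords}


# Hypothetical sample data, not taken from a real extraction.
description = "Wireless headphones with wireless charging case. Bluetooth 5.0 headphones."
freq = keyword_frequency(description, ["wireless", "headphones", "bluetooth"])
print(freq)
```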
Extracting and structuring data from a website gets you part of the way, but data transform gives you the flexibility to get the right data for your business. For example, you can add a column of prices to a column of shipping costs to arrive at the total cost for a product in a new column.
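The price-plus-shipping calculation can be sketched in Python as follows. The column names (`price`, `shipping`, `total_cost`) and the sample rows are assumptions made for illustration; in Import.io itself this would be expressed with an Excel-like formula instead.

```python
from decimal import Decimal


def parse_money(text):
    """Convert a price string such as "$1,299.00" to a Decimal."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned)


def add_total_cost(rows):
    """Return copies of the rows with a new total_cost column (price + shipping)."""
    result = []
    for row in rows:
        total = parse_money(row["price"]) + parse_money(row["shipping"])
        result.append({**row, "total_cost": str(total)})
    return result


# Hypothetical extracted rows.
rows = [
    {"product": "Kettle", "price": "$24.99", "shipping": "$5.00"},
    {"product": "Laptop", "price": "$1,299.00", "shipping": "$0.00"},
]
totals = add_total_cost(rows)
for row in totals:
    print(row["product"], row["total_cost"])
```

Using `Decimal` rather than `float` avoids rounding surprises when adding currency amounts.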
Or you can build a new column from part of the data in another column, such as stripping the excess words from a rating to turn “3.8 stars out of 5” into “3.8” (see example).
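As a rough illustration of the rating cleanup, the Python sketch below pulls the first number out of a rating string. The function name `extract_rating` is hypothetical, and the regular expression stands in for whatever text function you would choose in Import.io.

```python
import re

# Matches the first integer or decimal number in a string.
RATING_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def extract_rating(text):
    """Return the leading numeric rating as a string, or None if no number is found."""
    match = RATING_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_rating("3.8 stars out of 5"))
print(extract_rating("No rating yet"))
```

Because the pattern takes the first number it finds, it relies on the rating appearing before the “out of 5” part of the text.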
When websites have multiple ways of representing the same information, such as “10K” or “10,000”, Import.io data transform allows you to set up rules that automatically and consistently format the data every time your extractor runs.
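A formatting rule of this kind might look like the following Python sketch, which maps both “10K” and “10,000” to the same integer. The supported suffixes (K and M) and the name `normalize_count` are assumptions for illustration only.

```python
import re

# Assumed suffixes; extend the table if your sources use others.
SUFFIX_MULTIPLIERS = {"K": 1_000, "M": 1_000_000}
COUNT_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KkMm]?)\s*$")


def normalize_count(text):
    """Convert strings such as "10K" or "10,000" to an integer, or None if unrecognized."""
    cleaned = text.replace(",", "")  # drop thousands separators
    match = COUNT_PATTERN.match(cleaned)
    if match is None:
        return None
    number, suffix = match.groups()
    multiplier = SUFFIX_MULTIPLIERS.get(suffix.upper(), 1)
    return round(float(number) * multiplier)


for raw in ["10K", "10,000", "1.5k", "n/a"]:
    print(raw, "->", normalize_count(raw))
```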
If the website data has gaps, you can use a logical formula to fill them in, for instance by populating the country column based on state, city, or zip code.
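The gap-filling idea can be sketched in Python as a lookup that runs only when the country is missing. The `STATE_TO_COUNTRY` table, the column names, and the sample rows are all hypothetical; a real rule would need a complete mapping for the regions your sites cover.

```python
# Hypothetical lookup table; a real one would cover every state or province you expect.
STATE_TO_COUNTRY = {
    "NY": "United States",
    "TX": "United States",
    "ON": "Canada",
}


def fill_country(rows):
    """Return copies of the rows with empty country values filled from the state column."""
    filled = []
    for row in rows:
        # Keep an existing country; otherwise look it up, leaving "" if the state is unknown.
        country = row.get("country") or STATE_TO_COUNTRY.get(row.get("state", ""), "")
        filled.append({**row, "country": country})
    return filled


rows = [
    {"city": "Albany", "state": "NY", "country": ""},
    {"city": "Toronto", "state": "ON", "country": "Canada"},
    {"city": "Unknown", "state": "ZZ", "country": ""},
]
filled_rows = fill_country(rows)
for row in filled_rows:
    print(row["city"], "->", row["country"] or "(still missing)")
```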
Once you pick which formula or logic you want to use, the entire column is calculated for you. If the data on a website has changed the next time you extract it, your transformed data updates to reflect those changes. This simplifies and automates data manipulation: set it up once, store it in the portal, and Import.io keeps the data up to date.
Import.io transform ensures highly accurate, up-to-date, and more complete data – further reducing the time-to-insight.