In an article for TechTarget’s SearchITChannel, Moshe Kranc, chief technology officer at Ness, describes how custom and commercial offerings, bolstered by machine learning, can now facilitate web data extraction. He offers Import.io as a commercial solution. Import.io’s point-and-click interface lets the user teach the system how to extract data from a given website, and machine learning then infers extraction patterns for new sites based on knowledge learned from other sites.
Kranc goes on to lay out best-practice guidelines for web data extraction tools, including web crawling, using proxies, and learning from user actions over time.
Click here for the complete article.