How do you know the difference?
Data is the hot new thing, and as such it has spawned a bunch of new terms and jargon, which can be pretty hard to keep track of. To help you sound like a data guru instead of a data noob, I’ll be taking you through some of the terms people tend to get a bit confused about.
One of the most common phrases I hear being used incorrectly is Data Mining. There is a very important distinction between Data Mining and Data Collection. I know they sound like they’d be the same thing, but they’re actually very different.
So…What is Data Mining?
Data Mining refers to the computational process of discovering patterns in large data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, predictive analytics, and database systems.
That’s a fancy way of saying data mining shows you the important patterns inside your existing dataset. As we all know, data is only as useful as the conclusions we can draw from it. Data mining is simply the process of “mining” that meaning from what would otherwise be an unintelligible spreadsheet.
For example, you might use a data mining program to analyze the buying patterns of your customers and discover that men who bought diapers between Thursday and Saturday were also likely to buy beer.
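To make the diapers-and-beer idea concrete, here is a minimal sketch of how such a pattern might be measured. It assumes a handful of invented shopping baskets (not real sales data) and a hypothetical helper called `rule_stats`, and it uses three standard association-rule measures: support (how often both items appear together), confidence (how often beer appears when diapers do), and lift (how much more often than chance). Real data mining tools do far more than this, but the core idea is similar.

```python
# Invented toy baskets, purely for illustration (not real sales data).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]


def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    # Count baskets containing both items, and each item on its own.
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    ante = sum(1 for b in baskets if antecedent in b)
    cons = sum(1 for b in baskets if consequent in b)

    support = both / n                           # share of all baskets with both items
    confidence = both / ante if ante else 0.0    # share of antecedent baskets that also have the consequent
    lift = confidence / (cons / n) if cons else 0.0  # above 1 means more often than chance
    return support, confidence, lift


support, confidence, lift = rule_stats(baskets, "diapers", "beer")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.3f}")
```

A lift above 1 suggests the two items show up together more often than you would expect if they were unrelated, which is the kind of pattern a data mining program surfaces for you.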
Not to be confused with…Data Collection
Data Collection, unlike data mining, is exactly what it sounds like: the process of gathering and measuring information, usually with software. There are loads of different data collection techniques and procedures, but when people talk about it in terms of Big Data (which most buzzword lovers do), they usually mean electronic (or online) data collection.
That’s what we do at import.io! We provide a Web Data Collection tool to help people gather data from websites. The mining of that data is up to you :-)!
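For a rough feel of what collecting data from a web page involves, here is a small sketch using only Python's standard library. It is not how import.io's tool works; the HTML snippet, the `price` class name, and the `PriceCollector` class are all invented for illustration, and the page is an inline string rather than a live website.

```python
from html.parser import HTMLParser

# Invented HTML snippet standing in for a downloaded web page.
PAGE = """
<ul>
  <li class="product">Diapers <span class="price">9.99</span></li>
  <li class="product">Beer <span class="price">7.49</span></li>
</ul>
"""


class PriceCollector(HTMLParser):
    """Collects the text inside every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False

    def handle_data(self, data):
        # Only keep text that appears inside a price span.
        if self.in_price:
            self.prices.append(float(data))


collector = PriceCollector()
collector.feed(PAGE)
print(collector.prices)
```

Notice that nothing here looks for patterns: the code only gathers the raw numbers. Working out what they mean is the mining step.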
Data Mining – analysing data to find useful patterns
Data Collection – the process of gathering large amounts of data (often from the web)