Blog

Give Back discounts: Another way Import.io is democratizing web data

Import.io makes it easy to work with web data, and now we’re making it affordable too. Import.io is committed to giving back to the wider community. We are firm believers in the principle that data gained from the web can fuel powerful insights and important results. That’s why we have decided to offer our new […]

7 ways Import.io just made data extraction easier for you

Today we are pleased to announce the November release of Import.io. This release is focused mainly on a newly updated version of our extraction platform. The data extraction platform is the brains that runs all of the web pages through your Extractors and helps you convert web pages into data. The way that you build […]

What you need to know to make Deep Learning work for you

Artificial Intelligence made promises that it couldn’t keep for over five decades.  But that has all changed with the advent of Deep Learning, which has been delivering staggering results in the past few years. The field of Deep Learning is accelerating as researchers try bolder and bolder experiments. Some of the applications of Deep Learning, […]

Machine learning dataset for musical training

At the beginning of October, my partner Aida and I released a Twitter bot – LnH AI: The Band. This hobby project of ours is a music bot capable of composing music on demand, based on tweets that users send to it. It is powered by special Deep Learning models that we have developed over the […]

Artificial Intelligence Regulation: Let’s not regulate mathematics!

On Wednesday, ahead of today’s White House Frontiers Conference, the White House Office of Science and Technology Policy released its report on Preparing for the Future of Artificial Intelligence. The report is optimistic, comprehensive and well-balanced. In summary: full-speed ahead.  But let’s be smart when it comes to Artificial Intelligence regulation. The premise is that […]

What is Artificial Intelligence? Louis Monier explains everything.

What is Artificial Intelligence? Our Chief Scientist Louis Monier gives you the straight dope on AI. Artificial Intelligence, always a very polarizing subject, is back on top of the news. Unless you have been on a deep space mission for the past year, you have been exposed to opinions ranging from “this will change everything […]

Can data shape the future of mental health support?

What follows is a reproduction of an article from The Guardian newspaper. It discusses Plexus, an app developer that is using Import.io to acquire mental health data from the web for their chatbot named Grace. Can data shape the future of mental health support? If you’re experiencing a mental health issue, one of the people you probably least […]

5 New Advanced Data Extraction Features To Try Out

You asked for more powerful data extraction features. We took your feedback and set our Engineers loose on the challenge. We are excited to announce 5 brand new advanced data extraction features that will help you get data out of more websites: Disable CSS, Default Column Values, Advanced Regex Support, Require Column Values, Raw HTML […]

Our biggest update ever!

On the 2nd of April, we launched a new version of Import.io with many enhanced capabilities. The highlights include: a 100% web-based experience including an all-new web extractor; Scheduler – schedule your extraction runs right from our UI; Data Storage – allow us to store your data for you; upgraded JavaScript support; highly scalable backend […]

Query API – Service Availability

Last updated: 2pm PST, Wednesday 9th March 2016. The new Query API has been in production for 24 hours now at 99.9% availability. We will continue to monitor this situation but you can assume that normal service has been resumed. If you have any questions, please reach out to support@import.io. First published: 9am PST, Tuesday 8th March 2016. The Import.io […]

Great alternatives to every feature you’ll miss from kimono labs

If you’re one of the 125k KimonoLabs users who got this message last week… “After almost 2 years of building and growing kimono, we couldn’t be happier to announce that the kimono team is joining Palantir.” …you’re probably wondering what to do next. There’s no denying that kimono was a useful service with some great […]

5 undeniable reasons to invest in data before it’s too late

A Gartner survey revealed that “more than 75% of companies are investing or planning to invest in big data in the next two years.” Companies see the value in data, including enhanced customer experiences, streamlined processes, and targeted marketing. It’s a high-return investment that keeps businesses competitive in their sectors. However, executives and managers get […]

The UK startup scene is booming

Today we are pumped to announce that we are joining 29 other amazing UK startups as part of Tech City UK’s Upscale program! The Upscale program will power 30 of the fastest growing UK tech companies on their scaling journey by connecting them with Scale Coaches who have a proven track record of growing successful […]

How today’s top Marketers and Growth Hackers use data every day

Marketing data is an essential part of any good marketing strategy. While the process of collecting, storing and analyzing data may be complex (and require an advanced math degree), understanding and harnessing the power of that data doesn’t have to be. Wondering how data can help make your organization’s marketing better, stronger and more effective? We asked […]

Neural nets: How Regular Expressions brought about Deep Learning

2016 is the year of Deep Learning, whose popularity has been on a steep incline ever since Google bought DeepMind at the start of 2014. Last year’s technical breakthroughs, acquisitions, funding deals, and open source releases have all helped to cement Deep Learning as the hip artificial intelligence. Our CTO, Matt Painter, explains how Deep Learning got its […]

Import.io’s $13 million Series A

We are pleased to announce that Import.io (that’s us btw) has just raised a $13 million Series A round of funding. The new investment was led by Imperial Innovations, with participation from Wellington Partners, Oxford Capital, Open Ocean, Delin Capital and AME Cloud Ventures. “This new round of funding represents the biggest investment ever in web […]

Trump is winning the media mention war

Whether or not you’re a fan of “The Donald”, you can’t deny that he is everywhere! Trump is dominating both the left and right wing arms of the mainstream media at the moment with more mentions than any other presidential nominee! According to journalism.org… “Overall, when respondents are asked what outlet they turn to most […]

Anthony Goldbloom gives you the secret to winning Kaggle competitions

Kaggle has become the premier Data Science competition where the best and the brightest turn out in droves – Kaggle has more than 400,000 users – to try and claim the glory.  With so many Data Scientists vying to win each competition (around 100,000 entries/month), prospective entrants can use all the tips they can get. […]

10,000 leads in 10 minutes

Ok, let’s start with the basics. Leads represent the first stage of the sales process.  In its simplest form a lead is any “person or entity that has an interest and authority to purchase your product or service”. Or in other words: someone you can sell to.

Sounds good. So, what information do you need about that person or entity for it to be an actionable lead? A good rule of thumb is to look for the information that you would find on a business card, i.e. their name, associated company and contact details.

10 keynote talks on data and business that you need to watch

In recent years more than ever before, if you’re not paying attention to the data, you’re missing out on opportunities that can help you grow your business. Data is everywhere. The businesses that will survive and thrive over the next few decades (and long into the future) are the ones that are committed to capturing, […]

Andrew Ng shares the astonishing ways deep learning is changing the world

Just when you thought you’d got your head around the whole Machine Learning thing… BAM! There’s a new tech buzzword in town rearing up to take its place.

Deep learning.

And while it may seem like just another Silicon Valley buzzword that all the new startups will claim to be using, deep learning is actually already being used to make some really astounding advances. We’re talking borderline science fiction here.

We caught up with deep learning expert, Andrew Ng, and asked him to explain what deep learning is and how we should expect to see it change the world in 2016.

20 questions to detect fake data scientists

Now that the Data Scientist is officially the sexiest job of the 21st century, everyone wants a piece of the pie.

That means there are a few data posers out there. People who call themselves Data Scientists, but who don’t actually have the right skill set.

This isn’t always done out of a desire to deceive. The newness of data science and the lack of a widely understood job description mean that many people may think they are data scientists purely because they deal with data.

How to buy data for your business

Like many businesses today, you know you need external data to drive insights.

What you may not know is how to get your hands on that data. You would think in this day and age, like pretty much everything else, you could just buy what you want. And you can. Kind of.  

In this guide, we put ourselves in your shoes. The shoes of someone who needs to get external web data into their company. And we answer the all important question:

How do I get external data into my business?

Airbnb’s Impact on San Francisco – How we helped the SF Chronicle get the data behind this news story

Everyone who lives in a busy metropolis like San Francisco has had to deal with rising house prices and decreasing availability. Business Insider reported that the median price for a 1-bedroom in San Fran was $3,100/month – officially making it more expensive than New York.

But did you know that rental services like Airbnb, Homeaway and Flipkey could be partly to blame? In a recent article, the San Francisco Chronicle analysed the latest data from these sites to determine just how bad these services are for the real estate market.

22 data experts share their predictions for 2016

Predicting the future is never an easy task. But as 2015 winds down, we can’t help but look forward to what the new year will bring. 

Will you finally be able to buy a self-driving car? Will machines become smarter than man? And what will happen to the world of data science?

We’re no fortune tellers, so we rounded up a bunch of experts to ask them what they thought. And here (in no particular order) is what they said:

3 easy ways to get your data into R

If you haven’t heard of R before, you should know that it’s one of the most popular statistical programming languages in the world, used by millions of people. Its open source nature fosters a great community which helps make data analysis accessible to everyone. If you want a better understanding of how R works, and its syntax, we recommend taking this free Introduction to R tutorial by DataCamp.

While import.io gives you access to millions of data points, R gives you the means to perform powerful analysis on that data and to turn it into beautiful visualizations. It’s a pretty nifty combo!

In this post, you’ll learn 3 easy ways to get your import.io data into R. This is a beginner tutorial so don’t worry if you’re not that familiar with R or import.io’s advanced features.

Let’s get started!

From masters to Microsoft: 7 charts that plot the path of today’s data scientists

There are currently 11,400 data scientists worldwide and 52% of them earned that title within the past 4 years. Where are they coming from? Where are they working? What did they do to get here?

Based on a massive study done by RJMetrics that analyzed 360 million LinkedIn profiles, we were able to answer these questions and more in order to paint a picture of the state of data science today.

In this post, we’ll look at 7 charts that determine the education level of data scientists, their areas of study, where geographically they work, and which companies they work for.

How to crawl a website the right way

The word “crawling” has become synonymous with any way of getting data from the web programmatically. But true crawling is actually a very specific method of finding URLs, and the term has become somewhat confusing.

Before we go into too much detail, let me just say that this post assumes that the reason you want to crawl a website is to get data from it and that you are not technical enough to code your own crawler from scratch (or you’re looking for a better way). If one (or both) of those things is true, then read on, friend!

In order to get data from a website programmatically, you need a program that can take a URL as an input, read through the underlying code and extract the data into either a spreadsheet, JSON feed or other structured data format you can use. These programs – which can be written in almost any language – are generally referred to as web scrapers, but we prefer to call them Extractors (it just sounds friendlier).

A crawler, on the other hand, is one way of generating a list of URLs you then feed through your Extractor. But it’s not always the best way.
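
To make the distinction concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not how Import.io's Extractors work internally: the function names extract and discover_urls, the sample page and the example.com URLs are all assumptions made up for this example. The extractor turns one page into a structured record, while the crawler step only produces the list of URLs you might feed through it next.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class _PageParser(HTMLParser):
    """Collects the <title> text and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(url, html):
    """Extractor (hypothetical): turn one page into a structured record."""
    parser = _PageParser()
    parser.feed(html)
    return {"url": url, "title": parser.title.strip()}


def discover_urls(base_url, html):
    """Crawler step (hypothetical): list the absolute URLs a page links to."""
    parser = _PageParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


# Made-up page content, hard-coded so the sketch runs without a network call.
SAMPLE_HTML = """
<html><head><title>Product list</title></head>
<body><a href="/item/1">One</a> <a href="/item/2">Two</a></body></html>
"""

record = extract("http://example.com/", SAMPLE_HTML)
urls = discover_urls("http://example.com/", SAMPLE_HTML)
print(json.dumps(record))  # structured data for this one page
print(urls)                # candidate URLs to run through the extractor
```

In a real project you would fetch each page over HTTP and pull out far more than the title, but the split stays the same: one piece finds URLs, the other turns each URL into data.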

38 great resources for learning data mining concepts and techniques

In the blossoming world of Big Data, the data miner is king. 

With today’s tools, anyone can collect data from almost anywhere, but not everyone can pull the important nuggets out of that data. Whacking your data into Tableau is an OK start, but it’s not going to give you the business critical insights you’re looking for. To truly make your data come alive you need to mine it. Dig deep. Play around. And tease out the diamond in the rough.

Jumpstarting your data mining journey can be an uphill battle if you didn’t study data science in school. Not to worry! Few of today’s brightest data scientists did. So, for those of us who may need a little refresher on data mining or are starting from scratch, here are 38 great resources to learn data mining concepts and techniques.

All the best big data tools and how to use them

There are thousands of Big Data tools out there, all of them promising to save you time and money and to help you uncover never-before-seen business insights. And while all that may be true, navigating this world of possible tools can be tricky when there are so many options.

Which one is right for your skill set?

Which one is right for your project?

To save you some time and help you pick the right tool the first time, we’ve compiled a list of a few of our favorite data tools in the areas of extraction, storage, cleaning, mining, visualizing, analyzing and integrating.  

Feature release: New modal and bulk REST API

Things are changing around here, and we think that’s a good thing! Every new feature or product update is designed to bring you more functionality and a better user experience. This one is no different. 

This latest release gives you a whole new way to access the API creation workflow and a new way to use Bulk Extract.

Using AWS Lambda and API Gateway to create a serverless schedule

At import.io, we believe that the best DevOps is NoOps.

That doesn’t mean that we don’t like DevOps people. On the contrary, we think they rock!

But ideally, we want everyone to be able to automate their work wherever possible. Instead of spending time on repetitive and mundane tasks, we push those jobs onto computers – leaving ourselves more time to work on great features.

With that in mind, we are starting to adopt microservices patterns as we scale our engineering efforts, so that new features are now delivered as separate components from the main platform. This allows us to iterate on new functionality quickly, test different technology stacks and involve people who do not have inside knowledge of the platform in the development process.

Our latest project, Scheduled APIs (which will let you run your Import.io APIs on a schedule), gave us a chance to revise our technology stack and look around at new paths for building long-lasting components. Specifically, we used a pair of new AWS solutions: AWS Lambda and Amazon API Gateway.
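
For readers who have not seen a Lambda function before, here is a rough Python sketch of the general shape such a component could take. It is not Import.io's actual implementation: the SCHEDULES table, the trigger_run helper and the api.example.com URL are hypothetical placeholders. The only parts taken from AWS conventions are the handler(event, context) signature and the statusCode/body response shape that API Gateway's proxy integration expects.

```python
import json

# Hypothetical schedule store; a real service would keep this in a database.
SCHEDULES = {"daily-prices": "https://api.example.com/extractor/123/run"}


def trigger_run(url):
    """Placeholder for the HTTP call that would start an extraction run."""
    return {"url": url, "status": "queued"}


def lambda_handler(event, context, run=trigger_run):
    """Entry point in the shape AWS Lambda uses for Python functions.

    The extra `run` argument is an assumption added for this sketch so the
    network call can be swapped out; Lambda itself passes only event/context.
    """
    name = event.get("schedule")
    url = SCHEDULES.get(name)
    if url is None:
        body = {"error": "unknown schedule"}
        return {"statusCode": 404, "body": json.dumps(body)}
    return {"statusCode": 200, "body": json.dumps(run(url))}


# Local usage example: invoke the handler the way a test harness might.
result = lambda_handler({"schedule": "daily-prices"}, None)
print(result["statusCode"], result["body"])
```

Because there is no server to manage, the whole component is just this function plus its configuration, which is the NoOps appeal described above.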

Mind the wealth gap: Inequality and London’s Underground

As with most big cities, London has a serious problem with income inequality. After seeing this great piece in the New Yorker on the spread of inequality across NYC’s subway system, we decided to create our own following London’s Underground. This interactive Tableau viz charts these shifts using data on median household income, from the […]

Oxfam making headlines

The data team at Oxfam used data to show that Britain’s five richest families are worth more than the poorest 20%, and to lobby for policy change, proving that just one simple data set can start something huge. Read more in The Guardian.

Creating data visualisations with tableau

The lovely Jewel from Tableau Public has created this stunning data viz with data she extracted from her local radio station. Have a play and see how popular your favorite artist is in Seattle. View the Tableau visualization.

Using data for product development by Steven Sinofsky (Andreessen Horowitz)

Want to know how successful companies like Microsoft got to where they are today? By using data in smart, innovative ways. 

In this interview with Editor-in-Chief of ReadWrite, Owen Thomas, Steven Sinofsky draws on his experience at Microsoft and Andreessen Horowitz to tell you how to use data to drive product development. This fun and engaging interview will teach you how to use data, what pitfalls to be aware of and how to align customer support and product development.