What is Artificial Intelligence? Louis Monier explains everything.

What is Artificial Intelligence?

Our Chief Scientist Louis Monier gives you the straight dope on AI.

Artificial Intelligence, always a very polarizing subject, is back on top of the news. Unless you have been on a deep space mission for the past year, you have been exposed to opinions ranging from “this will change everything for the better” to “this will spell our doom”.

But what are the facts? What is Artificial Intelligence? Why is it seeing a resurgence now? How can I benefit from it? How does it affect my business? This is the first in a series of posts where we answer the most common questions about Artificial Intelligence.

So what is Artificial Intelligence?

Put simply, Artificial Intelligence, or AI, is the study of tasks that are effortless for humans but very difficult for machines. Let’s think of what we can accomplish within the first few years of our lives:

  • We can recognize people and objects, and pick up a voice out of a noisy background.
  • We are on our way to mastering a language by our third year.
  • We acquire a lot of common sense: we know a big toy won’t fit inside a small box, that if you play with water you’ll get wet, and that your friends will get mad if you don’t share toys with them.
  • We learn that to stack cubes they need to be pretty much on top of each other, or they will fall.

These are examples of perception, language understanding, reasoning, and planning. None of this is easy to describe to a computer.

The shortest history of Artificial Intelligence

Back in the fifties, when the term was first coined, AI was imagined to be a game of logic: we would encode rules, and somehow the program would exhibit intelligent behavior. With enough rules we were going to have autonomous robots driving around Mars, and programs to translate Russian newspapers into English. It didn’t happen. Writing enough rules for any practical application turned out to be daunting, as we needed rules about when to apply the rules, rules about exceptions to the rules, and rules for common sense, which require describing the minutiae of just about everything in our world. Rule-based AI disappointed several times over the past sixty years, with cycles of hype ending in so-called “AI Winters”.

Meanwhile a small group of renegade researchers were exploring a very different approach: start from data and use statistics to extract patterns and relationships. Known as Machine Learning, this domain was a footnote to mainstream AI for a long time, while rule-based AI was dogma. Encouraging results led to further experimentation, and sometime around 2012 a subset of Machine Learning, dubbed Deep Learning, exploded with practical applications. Today, when an article mentions recent developments in AI, it’s almost always about Deep Learning.

What makes Deep Learning tick?

At the cocktail party level, Deep Learning is the art of running a massive amount of data through a large network of tiny pieces of code, known as artificial neurons, which differ only in the specific numbers, called weights, that each of them holds. Starting from random weights and guided by precise math, the network absorbs a large number of examples, which can be text, images, voice samples, or just about anything that can be quantified. With each sample a little bit of learning takes place as the weights change ever so slightly. After millions of examples, the network is trained for a specific task. The “learning” is not the green fluorescent serum of movies; it’s simply a large set of numbers. The training process can be lengthy, but once a network is trained, it can do its job very efficiently: it can take weeks to train a large network to recognize thousands of types of objects, but the trained network then takes milliseconds to process an image.
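
To make the weight-nudging idea concrete, here is a minimal sketch in plain Python. A single neuron, holding just one weight and one bias, learns the made-up rule y = 2x + 1 from examples. The function name, the learning rate, and the data are all assumptions chosen for illustration; real networks hold millions of weights and are trained with specialized libraries.

```python
import random

def train_neuron(examples, epochs=200, learning_rate=0.05, seed=0):
    """Fit one neuron (output = weight * x + bias) by nudging its
    two numbers a little after every example."""
    rng = random.Random(seed)
    weight = rng.uniform(-1.0, 1.0)  # start from random weights
    bias = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        for x, target in examples:
            prediction = weight * x + bias
            error = prediction - target
            # Move each number slightly in the direction that shrinks the error.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias

# Hypothetical training data: 21 points that follow y = 2x + 1.
examples = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
weight, bias = train_neuron(examples)
print(round(weight, 3), round(bias, 3))
```

After training, the two numbers should land very close to 2 and 1. That pair of numbers is all the “learning” there is.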

The basic neurons are so simple that high-school math can describe their function. These neurons are arranged in regular patterns, typically in layers. The “deep” in Deep Learning refers to the number of layers the data has to cross. In the early days there were only a couple of layers. Today there can be 1,000 layers. Next year? Who knows!
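
For the curious, the sketch below shows roughly what that high-school math might look like: a neuron as a weighted sum pushed through a squashing function, a layer as several neurons reading the same inputs, and a network as layers applied one after another. The sigmoid function is one common choice among several, and the weights are hypothetical values picked by hand; a real network would learn them.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum squashed into the range (0, 1)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation

def layer(inputs, weight_rows, biases):
    """A layer is several neurons looking at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def network(inputs, layers):
    """Pass the data through each layer in turn; 'deep' means many layers."""
    values = inputs
    for weight_rows, biases in layers:
        values = layer(values, weight_rows, biases)
    return values

# Hypothetical weights, for illustration only.
tiny_network = [
    ([[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]], [0.0, -1.0, 0.5]),  # 2 inputs -> 3 neurons
    ([[1.0, -1.0, 0.5]], [0.0]),                                  # 3 inputs -> 1 neuron
]
output = network([1.0, 0.0], tiny_network)
print(output)
```

Making the network deeper only means adding more entries to the list of layers; the code for each neuron stays the same.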

As data makes its way through the layers, simple patterns become more complex and capture increasingly abstract concepts. For example, starting from the many pixels that make up an image, the first layer extracts very simple patterns, like short line segments; the next layer combines these to form something like an edge or a rounded corner; in the next layer we might find star patterns, or circles, or spirals; then eyes, legs, honeycomb patterns; then faces; then specific animals or objects, and so on. As a rule, the more layers, the more powerful the network.
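
A toy version of this layering effect can be sketched with three neurons. In the example below, the first layer detects two simple patterns (“at least one input is on” and “both inputs are on”), and the second layer combines them into a more complex one (“exactly one input is on”), which a single neuron of this kind cannot compute on its own. The weights are hand-picked for illustration rather than learned, and the on/off step function is an assumption made to keep the arithmetic easy to follow.

```python
def step_neuron(inputs, weights, bias):
    """Fires (returns 1) when the weighted sum of its inputs is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def exactly_one(a, b):
    # Layer 1: two simple detectors looking at the raw inputs.
    either = step_neuron([a, b], [1, 1], -0.5)  # fires if at least one input is on
    both = step_neuron([a, b], [1, 1], -1.5)    # fires only if both inputs are on
    # Layer 2: combine the simple patterns into "one, but not both".
    return step_neuron([either, both], [1, -2], -0.5)

results = {(a, b): exactly_one(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

The second-layer neuron never sees the raw inputs, only the patterns found by the first layer. Image networks stack the same idea many times over.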

Collectively, a network of millions or billions of artificial neurons exhibits a hard-to-explain ability to learn, and routinely surpasses human performance at specific tasks. Nobody is surprised that computers beat humans at arithmetic, but we now have to come to terms with the fact that they are better than us at understanding an accented voice in a noisy environment, or at distinguishing specific species of birds. Welcome to the future!

Something remarkable is that these networks basically all use the same components, the same math, and the same learning routines. Still, they can act on text or images or voice, without any specific knowledge about these domains. The only difference is the arrangement of the neurons. Today it’s still a bit of an art to know which network configuration (which arrangement of neurons) will do well on a particular problem.

Deep Learning is behind applications that we take for granted, but would have been science-fiction only a decade ago: voice input that actually works; recognizing the faces of a billion people; tagging and even writing captions for photos; self-driving cars; chatbots that you don’t mind interacting with. Standing back a bit, one can see that progress has been the most impressive around perception: the ability to make sense of images, to go from sound waveforms to sentences, and to grasp more about human languages. This is immense progress, but to keep things in perspective, we are very far from any sort of general intelligence, and consciousness is not part of any serious conversation at this point.

The main flavors of Deep Learning

Deep Learning comes in several flavors. Here are the three main ones.

Supervised learning is learning to achieve a goal from examples. Feed a neural network millions of tagged photos (this is a cat, this is a bird…) and it can spot a cat in new images. Or a bird. Or 20,000 other things. This is how Google Photos can search your family photos without requiring you to painstakingly add “kayak”, “lake”, “vacation” or “sunglasses” to each photo. It learned from the billions of tagged photos that it found on the Web.
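
Learning from tagged examples can be sketched in a few lines. The example below is not a neural network: it uses a much simpler stand-in, a nearest-centroid classifier, which averages the examples seen for each tag and then assigns a new item to the closest average. The features (weight in kilograms and number of legs) and all the data points are invented for illustration.

```python
def train(labeled_examples):
    """Average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in labeled_examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose average example is closest to the new item."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical tagged examples: (weight in kg, number of legs) -> tag.
tagged = [
    ((4.0, 4), "cat"), ((5.5, 4), "cat"), ((3.8, 4), "cat"),
    ((0.1, 2), "bird"), ((0.5, 2), "bird"), ((1.2, 2), "bird"),
]
model = train(tagged)
print(predict(model, (4.8, 4)), predict(model, (0.3, 2)))
```

Nobody wrote a rule saying what a cat is. The program was only shown tagged examples, and it classifies items it has never seen by comparing them with what it learned.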

Reinforcement learning is similar, but the network gets its training data by interacting with a system. Think about how you get better at a game by playing it. AlphaGo, the DeepMind program that beat Lee Sedol, one of the world’s best Go players, in the spring of 2016, uses reinforcement learning. So do some components of a self-driving car, which learn to play the game of “stay in the lane and don’t run into anything” by interacting with a driving simulator.
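
Here is a minimal sketch of learning by playing, using a classic table-based method called Q-learning. It is far simpler than anything inside AlphaGo or a self-driving car. The “game” is a hypothetical five-cell corridor: the player starts in cell 0, moves left or right, and is rewarded only on reaching the last cell. The environment, the parameter values, and the purely random exploration are all assumptions made for illustration.

```python
import random

N_CELLS = 5                  # corridor cells 0..4; the goal is the last cell
ACTIONS = ("left", "right")

def step(state, action):
    """Toy environment: move along the corridor, reward 1 at the goal."""
    if action == "right":
        next_state = min(state + 1, N_CELLS - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == N_CELLS - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Learn the value of each move in each cell purely from played games."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_CELLS)]
    for _episode in range(episodes):
        state = 0
        for _move in range(max_steps):
            action = rng.choice(ACTIONS)             # explore by acting at random
            next_state, reward = step(state, action)
            best_next = max(q[next_state].values())  # goal cell's values stay 0
            # Nudge the estimate toward "reward now plus discounted future value".
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == N_CELLS - 1:
                break
    return q

q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_CELLS - 1)]
print(policy)
```

The program is never told that moving right is the goal. It works that out from the rewards it collects while playing, and the learned policy should favor “right” in every cell.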

Unsupervised learning uses no labels. The network is given a large amount of data and must find interesting correlations. An experiment by Google in 2012 famously discovered that there are indeed lots of pictures of cats (and people) on YouTube, without ever being told what a cat looks like. Similarly, it is now common to feed a large amount of text (for example the content of Wikipedia, or millions of news stories) to a neural network and have it derive which words are similar in meaning.
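
The word-similarity idea can be sketched without a neural network at all. The example below simply counts which words appear together in the same sentence, then compares those counts: words used in similar company come out as similar. The six-sentence corpus is invented for illustration, and counting co-occurrences is a deliberately crude stand-in for what a trained network does with millions of documents.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences):
    """For each word, count the other words that share a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vectors[word][other] += 1
    return dict(vectors)

def cosine(a, b):
    """Similarity of two count vectors: 1.0 means identical contexts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(vectors, word):
    """Find the other word whose contexts look most like this word's."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))

# Hypothetical unlabeled text: nobody says which words are animals or vehicles.
corpus = [
    "cat drinks milk", "dog drinks water",
    "cat chases mouse", "dog chases ball",
    "car needs fuel", "truck needs fuel",
]
vectors = context_vectors(corpus)
print(most_similar(vectors, "cat"), most_similar(vectors, "car"))
```

On this tiny corpus, “cat” ends up closest to “dog” and “car” closest to “truck”, even though no sentence ever says so.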

Most commercial applications today use supervised learning: rather than coding what the machine should do, you feed it lots of examples and it figures it out. Sounds easy, all you need is data! Interestingly, babies don’t learn this way. They seem to use unsupervised learning to learn to recognize voices (for example associating special types of sounds with a friendly face) and reinforcement learning to interact with people and objects around them (like doing this thing with their face that gets a positive reaction from the big people).

In our next post we’ll explore what makes Deep Learning tick, and the true secret behind it. To use Deep Learning, you don’t need a PhD, and you don’t need expensive software or your own data center. Most of what you need is freely available, and plenty of knowledgeable people are willing to point you in the right direction. The hard part, the true secret, is that you need data.