Prediction Machines
What I learned from “Prediction Machines: The Simple Economics of Artificial Intelligence”
By Robert Boespflug, robert@apn.ai
July 29th, 2018 - 8 min read

I recently finished reading Prediction Machines: The Simple Economics of Artificial Intelligence, a book about what artificial intelligence (AI) really means from the perspective of economists. It was written by three economists from the University of Toronto: Ajay Agrawal, Joshua Gans, and Avi Goldfarb. I found the book stunning in its clarity about what AI really is and what it means. Here are some of the most important takeaways from the book.


Why is it important to look at AI through the lens of economics?

Economists are good at ignoring the hype and noise surrounding a new technology. They don’t care if we call an economy the “New Economy” because of inventions like the desktop computer, the internet, or AI. The tools in an economist’s toolbox remain the same: supply and demand curves, price elasticity, and so on. Economists look at a new technology and simply try to figure out what it drops the cost of, and how that affects other costs. It was of course very transformational and exciting when desktop computers hit the market in the 1980s. They could do all sorts of cool things with killer apps like spreadsheets (e.g. VisiCalc, Lotus 1-2-3, Excel), and they doubled in power roughly every two years via Moore’s Law. To an economist, however, desktop computers did just one thing: they dropped the cost of arithmetic. In fact, arithmetic got so cheap that we took problems that weren’t math problems and converted them into math problems. Photography, for instance, used to be a chemistry problem before we figured out how to convert ones and zeros into pictures with math. The internet simply dropped the marginal cost of search, communication, and the transmission of digital products and services down to zero.
Today the new big thing is AI. If you went to the Consumer Electronics Show in Las Vegas this year you were probably amazed. Drones, robots, virtual reality, 3D printing, sensors galore... astonishing in sophistication and scale. An economist, however, thinks only about the part of AI that matters: Machine Learning and its powerful subset called Deep Learning. To an economist, that is what AI really is, and it is in a class all by itself. The reason is that Deep Learning is a General Purpose Technology (GPT). Semiconductors, for instance, are also a GPT. They are called that because they are a foundational input that impacts so many different things.

Most people are now aware that a computer beat a champion human Go player a couple of years ago. In March 2016, Google’s DeepMind pitted its Deep Learning machine AlphaGo against one of the very best Go players in the world, Lee Sedol. AlphaGo won 4-1. It was a “Sputnik” moment for China and caused it to make major investments in this new “alien technology”. AlphaGo has since gotten orders of magnitude stronger. Here is a useful definition of Deep Learning:

Deep Learning is an artificial intelligence function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. Deep Learning is a subset of Machine Learning in Artificial Intelligence (AI) that has networks capable of learning unsupervised from data that is unstructured or unlabeled.

Some experts call Deep Learning systems “Intuition Machines” because of the way the technology mimics some of the ways a human brain works. Others call Deep Learning the “new electricity”. The economists have come up with a better name: they call Deep Learning systems “Prediction Machines”. Basically, AI simply drops the cost of prediction. And prediction affects everything. And almost everything can be converted into a prediction problem.
What is Prediction?
PREDICTION: USING INFORMATION THAT YOU DO HAVE TO GENERATE INFORMATION THAT YOU DO NOT HAVE. 

Examples of areas where we use AI today for prediction include:

Customer Churn
Demand Forecasting
Image Classification
Translation
Supply Chain Management
Drug Discovery
Insurance
Autonomous Cars
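
Each item on this list is, at bottom, a fill-in-the-missing-information problem. As a minimal sketch of the definition above (the numbers and the least-squares method here are illustrative assumptions, not from the book), demand forecasting can be framed as: use the (month, sales) pairs you do have to generate the sales figure you do not have.

```python
# Prediction: use information we have (past months' sales)
# to generate information we do not have (next month's sales).
def fit_line(points):
    """Ordinary least squares fit of y = a + b*x."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

# Hypothetical sales history for months 1-4
history = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0)]
a, b = fit_line(history)
forecast_month_5 = a + b * 5  # the information we did not have: 18.0
```

Real demand-forecasting models are far richer than a straight line, but the economic framing is identical: known data in, unknown value out.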

Note that translation used to be a rules-based problem and we converted it into a prediction problem. We did the same with self-driving cars back in 2012. People forget that in 2012, computer engineers predicted that we would not have self-driving cars in our lifetimes. The programming was just too complex... too many “if/then” statements to deal with. Then we reframed the problem as a prediction problem and simply asked the self-driving car to do one thing: predict what a human driver would do. At first self-driving cars didn’t know what to do. Then they became awesome... very fast... better than humans.

Much discussion about AI emphasizes the variety of prediction techniques with increasingly obscure names and labels: classification, clustering, regression, decision trees, Bayesian estimation, neural networks, topological data analysis, deep learning, reinforcement learning, deep reinforcement learning, capsule networks, and so on. The techniques matter to technologists implementing AI for a particular prediction problem. The authors emphasize that each of these methods is about prediction: using information you have to generate information you don’t have. The book identifies situations in which prediction will be valuable, and then shows how to benefit as much as possible from that prediction. Cheaper prediction will mean more predictions. This is simple economics: when the cost of something falls, we do more of it.

Many problems have transformed from algorithmic problems (“what are the features of a cat?”) to prediction problems (“does this image with a missing label have the same features as the cats I have seen before?”). Machine learning uses probabilistic models to solve problems. So why do many technologists refer to machine learning as “artificial intelligence”? Because the output of machine learning—prediction—is a key component of intelligence, the prediction accuracy improves by learning, and the high prediction accuracy often enables machines to perform tasks that, until now, were associated with human intelligence, such as object identification. Prediction Machines are scalable and will continue to get much better, faster, and cheaper. 
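
The cat question above can be made concrete with a toy sketch (the feature vectors and labels here are invented for illustration; real image classifiers learn their features rather than being handed them): a nearest-neighbor rule predicts the missing label by asking which previously seen example the new image most resembles.

```python
# Predict a missing label from labeled examples seen before:
# the new item gets the label of its nearest known neighbor.
def nearest_label(seen, features):
    def sq_dist(p, q):
        # Squared Euclidean distance between two feature vectors
        return sum((a - b) ** 2 for a, b in zip(p, q))
    closest = min(seen, key=lambda item: sq_dist(item[0], features))
    return closest[1]

# Hypothetical (feature vector, label) pairs the machine has seen
seen = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.1, 0.2), "dog"),
]

# An image with a missing label: does it look like the cats seen before?
label = nearest_label(seen, (0.85, 0.75))  # -> "cat"
```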

Cheap Prediction increases the value of human judgment and creates better jobs. 

The press has it largely right when they talk about data being the new oil. But the authors believe the press has it quite wrong when discussing doomsday scenarios, future dystopian societies, and robots taking human jobs away. To explain why, it is instructive to understand a concept in economics called cross-price elasticity. When the price of coffee falls, the value of the “Complements” of coffee (cream and sugar) rises: we consume more coffee, but we still need to put cream and sugar in it. Tea, by contrast, is a “Substitute” for coffee. As coffee prices fall relative to tea, some tea drinkers switch from tea to coffee, which causes the value of tea to drop. Thus, when something gets cheaper, the value of its “Complements” rises and the value of its “Substitutes” falls.
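
The standard definition of cross-price elasticity makes the coffee example easy to check with arithmetic (the quantities and prices below are made up for illustration): it is the percentage change in the quantity demanded of one good divided by the percentage change in the price of another. A negative result marks a complement, a positive result a substitute.

```python
# Cross-price elasticity: % change in quantity demanded of good B
# divided by % change in price of good A.
def cross_price_elasticity(q_old, q_new, p_old, p_new):
    pct_quantity = (q_new - q_old) / q_old
    pct_price = (p_new - p_old) / p_old
    return pct_quantity / pct_price

# Hypothetical numbers: coffee's price falls 20% (5.00 -> 4.00).
# Cream consumption rises 10% (100 -> 110): negative elasticity, a complement.
cream = cross_price_elasticity(100, 110, 5.00, 4.00)  # -0.5
# Tea consumption falls 10% (100 -> 90): positive elasticity, a substitute.
tea = cross_price_elasticity(100, 90, 5.00, 4.00)     # 0.5
```

By this logic, cheap prediction raises the value of its complement, human judgment, just as cheap coffee raises the value of cream.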

To explain how this concept works in terms of Prediction Machines and human judgment, let’s turn to a very powerful graph provided by the authors: the Anatomy of a Task.