Twitter Hires Humans To Do the Work of Machines

Twitter has done a lot to change the way people share news, watch live television, and even protest. Along the way it’s had to deal with scaling problems as well as the challenge of tracking how users connect with each other, and of delivering the right tweets to the right streams instantly, or at least very quickly.

To do this, Twitter has built some impressive tools, such as Scalding, Storm, and FlockDB. Now, as it attempts to sell ads and make money operating its popular service, Twitter faces a problem that has for decades vexed people who deal with computers: How do you get a machine to understand what people are saying or typing? The goal is not just to recognize the words (although in speech-recognition programs that can still be tough), but to understand that when people mention “cancer,” they might be referring to either a disease or an astrological sign.

Many people view this as a problem for machines, another step on the path to true artificial intelligence, and have made elaborate efforts to solve the problem of translating human thought into something a machine can parse. On Twitter, where an inane comment during a presidential debate can result in a #bindersfullofwomen hashtag, how does a computer recognize what that means and what type of ads to place against it?

The answer is that it doesn’t. Twitter instead turns to real people, via an automated process it described in a blog post published on Tuesday. It has essentially automated a query to Amazon’s (AMZN) Mechanical Turk service, which bids out jobs to real people.

From the Twitter post:

Suppose that our Storm topology has detected that the query [Big Bird] is suddenly spiking. Since the query may remain popular for only a few hours, we send it off to live humans, who can help us quickly understand what it means; this dispatch is performed via a Thrift service that allows us to design our tasks in a web frontend, and later programmatically submit them to Mechanical Turk using any of the different languages we use across Twitter.

On Mechanical Turk, judges are asked several questions about the query that help us serve better ads.

There’s been a lot of effort to mimic the human brain in computers, but perhaps the best way to take advantage of people is to recognize what we do well and find cheap ways to tap our brains’ computational powers: not by replicating them in silicon, but by using computers to route each task to the most appropriate, cheapest, or nearest people available. Twitter has done this, but it’s not alone.

For example, ZestFinance, a company using data analysis to offer people credit, tweaked its credit-scoring model so that humans help determine which variables might really matter in a particular person’s scoring model. All told, about 25 percent of the variables the company analyzes are the result of human intervention, but it’s the mix of humans and the existing data analytics that makes the combination so powerful.

Another example of this combination is Gravity Labs, which uses people plus machine learning to construct interest graphs. When you combine people with better databases, faster computers, or task-optimized systems such as Siri or Watson, you have a more realistic version of artificial intelligence than some self-learning and thinking robot. It also drives home one of the more subtle aspects of the Big Data revolution that my colleague Derrick Harris pointed out earlier this month.

Data analytics in many cases is more about automation than insights.

Some of the best uses of advanced databases or data visualizations are in narrowing down what might be thousands or millions of variables into something that can be assessed by a person and then acted on. In Twitter’s case, the computers can handle the recognition of a hashtag’s spiking popularity, but a quick call to a person can tell you why it is spiking and what that hashtag means far more quickly and cheaply than a computer could. A person, however, can’t filter through billions of tweets to see what’s spiking and what isn’t.
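To make that division of labor concrete, here is a minimal sketch of the pattern in Python. It is not Twitter's code: the class name, the thresholds, the question wording, and the task format are all assumptions made up for illustration, and the sketch stops at building a task rather than calling any real Mechanical Turk API. The machine does the counting across time windows; only the handful of queries that suddenly spike get packaged as a question for a human judge.

```python
from collections import defaultdict, deque


class SpikeDetector:
    """Flags a query whose count in the current window far exceeds its recent average.

    All thresholds are illustrative assumptions, not values from Twitter's system.
    """

    def __init__(self, history=3, ratio=5.0, min_count=10):
        self.ratio = ratio          # how many times the baseline counts as a spike
        self.min_count = min_count  # ignore queries that are rare in absolute terms
        self.past = defaultdict(lambda: deque(maxlen=history))
        self.current = defaultdict(int)

    def observe(self, query):
        """Record one occurrence of a query in the current window."""
        self.current[query] += 1

    def close_window(self):
        """End the current window and return the queries that spiked in it."""
        spiking = []
        for query, count in self.current.items():
            previous = self.past[query]
            baseline = sum(previous) / len(previous) if previous else 0.0
            if count >= self.min_count and count > self.ratio * max(baseline, 1.0):
                spiking.append(query)
        # Roll every known query's history forward by one window.
        for query in set(self.past) | set(self.current):
            self.past[query].append(self.current.get(query, 0))
        self.current = defaultdict(int)
        return sorted(spiking)


def build_judgment_task(query):
    """Package a spiking query as a task for a human judge (hypothetical format)."""
    return {
        "query": query,
        "questions": [
            "What is this query about?",
            "Which category best describes it?",
        ],
    }


# Three quiet windows, then a sudden burst of one query.
detector = SpikeDetector()
for _ in range(3):
    detector.observe("big bird")
    detector.observe("big bird")
    detector.close_window()

for _ in range(50):
    detector.observe("big bird")
detector.observe("weather")

tasks = [build_judgment_task(q) for q in detector.close_window()]
```

In this toy run only the burst query ends up in `tasks`; the single "weather" query never reaches a person. That is the point of the arrangement: the filter is cheap and runs on everything, while human attention is spent only where the machine cannot tell what something means.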

Thus, perhaps the most practical AI—for now—is recognizing what humans can do and getting them the best, most compact information that allows them to make their decisions. Why replicate the brain if you don’t have to?
