If you could read everything on Google News that mentioned artificial intelligence, what would you find?
You’d see a lot of stuff about humans and Google.
Instead of reading everything article by article, let’s do what a computer would and zoom out to find the major patterns. This job got a lot easier when Mark Davies and BYU made the NOW corpus available last month. Its scripts run nightly and pull in about 10,000 news articles each day, and results go back to 2010. For the purposes of this post, I’m going to focus on the last year and a half.
Reading thousands of articles at a distance
From January 1, 2015 to June 1, 2016, there were 2,920 mentions of “artificial intelligence” across about 1,800 articles published in about 650 different publications. In case you’re worried, during that same time period, the word “human” occurred about 39 times as often (112,458 total occurrences).
If you did a word cloud of the contexts immediately surrounding “artificial intelligence”, it would look something like this:
But word clouds are terrible. They lump everything together and all you have is size to indicate frequency. If you see two big words, you don’t really know whether they always appear together or never do.
Topic modeling, on the other hand, gives you structure: it tries to group together words that tend to co-occur.
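To make the word-cloud problem concrete, here is a minimal sketch in Python. The two mini-corpora are made up for illustration (they are not from the NOW corpus), and `word_and_pair_counts` is a hypothetical helper name. Both corpora would produce the same word cloud, but the pair counts show that the words behave very differently.

```python
from collections import Counter
from itertools import combinations

def word_and_pair_counts(docs):
    """Count, per word and per word pair, how many documents contain them."""
    word_counts = Counter()
    pair_counts = Counter()
    for doc in docs:
        vocab = sorted(set(doc))  # count each word once per document
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))  # alphabetically ordered pairs
    return word_counts, pair_counts

# Two made-up mini-corpora (hypothetical data, not from the NOW corpus).
together = [["google", "robots"], ["google", "robots"], ["cars"], ["cars"]]
apart = [["google", "cars"], ["robots", "cars"], ["google"], ["robots"]]

for name, docs in (("together", together), ("apart", apart)):
    words, pairs = word_and_pair_counts(docs)
    print(name, words["google"], words["robots"], pairs[("google", "robots")])
```

In both corpora, “google” and “robots” each show up in two documents, so a word cloud would draw them at the same size. Only the pair count (2 in the first corpus, 0 in the second) tells you whether they travel together.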
If we put these articles into 10 clusters, we see a number of themes emerge. These are also basically what you get when you topic model with 12 or 15 clusters.
These are the major themes in press mentions of artificial intelligence for the last year and a half:
- Humans, humanity, people
  - This is a more heterogeneous cluster than some of the others, so more on it below
- Businesses and machine learning, cloud services
  - While there’s a lot of work in artificial intelligence happening in academia, most press is on what’s happening in industry
  - If we ignore “artificial intelligence” itself and focus on “machine learning” instead, those articles are more likely to mention technical concepts like natural language processing, speech recognition, neural networks, and deep learning
- Virtual assistants, Microsoft, Google, Facebook, Apple
  - There’s a big rise in chatbots this year, especially due to the problems with Microsoft’s Tay learning to be awful. For what it’s worth, the first examples in this corpus are from Gizmodo Australia in 2011 about distinguishing sexy humans and sexy chatbots.
- Robots, automation, and the industrial revolution
  - While there are lots of disembodied types of AI, an important segment of mentions talks about AI that can move around
  - In topic modeling, everything has to go somewhere, so mentions of virtual/augmented reality get slotted in here, mostly as mentions of another kind of development focus
- Google’s game-playing AlphaGo/DeepMind technology
- Self-driving cars
  - Autonomous and semi-autonomous vehicles are a big trend in how AI is being used, and if you look at journalists’ biographies, they often mention this as an explicit topic of interest
- Labs and researchers
  - Marvin Minsky, one of the pioneers in AI, died in January, which prompted a fair number of articles
- Weaponized AI and the end of us all (warnings from Stephen Hawking and Elon Musk)
  - A bit more on this below
- Airwheel S9
  - This is a self-balancing wheeled robot that has trajectory tracking and moving object tracking…but the majority of mentions come from one source, Press Release Rocket, which doesn’t seem to edit headlines very carefully
- IBM Watson
  - Watson doesn’t seem to be getting a lot of traction among major news sources lately; their current effort is to get developers using their platform
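If you want to see the mechanics behind clusters like these, here is a rough sketch of one common topic model, latent Dirichlet allocation (LDA), fit with collapsed Gibbs sampling. This is an assumption on my part for illustration: the list above doesn’t depend on this particular algorithm, and the function names, hyperparameters (`alpha`, `beta`, `n_iter`), and the four toy documents are all hypothetical.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per topic in each doc
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # total tokens per topic
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(n_topics) for _ in doc]
        assignments.append(zs)
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove this token, then resample its topic given all the others.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Most frequent words per topic, skipping words with zero count."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Toy documents (made up, not from the NOW corpus).
docs = [
    "google alphago deepmind go google".split(),
    "alphago go lee sedol deepmind".split(),
    "self driving cars autonomous vehicles".split(),
    "autonomous cars driving vehicles tesla".split(),
]
doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
print(top_words(topic_word))
```

Because the sampler is random, the exact topics depend on the seed and the number of iterations, which is also why it’s worth checking that the same themes show up at 10, 12, and 15 clusters before trusting them.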
Is Google winning AI?
This year, an AI system bested a human expert at the complex game of Go. Go is considered even more complex than chess, and Google’s AlphaGo team captured an enormous amount of press for the win.
One way to understand how a term is getting used is to look at its collocates—what are the other words and phrases that keep recurring around a target term versus everything else in the corpus? We’ve been looking at a 25-word window but let’s narrow the field a bit and see what appears within a four-word span of “artificial intelligence”. The NOW corpus tool reports this in co-occurrence counts as well as through mutual information, a measure that estimates how unlikely it is that words appear near each other by chance. For reference, the MI in this corpus between salt and pepper is 7.34 and the MI between computer and laptop is 3.21.
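For the curious, here is a minimal sketch of how a collocate score like this can be computed. It uses a common formulation of mutual information for collocates, log2 of (co-occurrences × corpus size) / (node frequency × collocate frequency × window size). Whether the NOW corpus tool normalizes in exactly this way is an assumption, and the function name, the toy sentence, and the idea of pre-joining a phrase into one token (e.g. `artificial_intelligence`) are mine for illustration.

```python
import math
from collections import Counter

def collocate_mi(tokens, node, span=4):
    """Map each word seen within `span` tokens of `node` to (count, MI score)."""
    n = len(tokens)
    freq = Counter(tokens)
    near = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - span), min(n, i + span + 1)
            for j in range(lo, hi):
                if j != i:
                    near[tokens[j]] += 1
    window = 2 * span  # words to the left plus words to the right
    return {
        w: (c, math.log2(c * n / (freq[node] * freq[w] * window)))
        for w, c in near.items()
    }

# Toy text (made up); a multiword node would first be joined into one token.
text = ("salt and pepper on the table salt and pepper in the soup "
        "the cat sat on the mat")
scores = collocate_mi(text.split(), "salt", span=2)
for word, (count, mi) in sorted(scores.items(), key=lambda kv: -kv[1][1]):
    print(f"{word:8s} {count:2d} {mi:6.2f}")
```

A frequent word like “the” can sit near the node and still score low, because it sits near everything. That is the point of using MI alongside raw counts.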
Not surprisingly, a lot of journalists use the full term “artificial intelligence” to define AI so they can go ahead and use that nice, shorter term in the rest of the article. Comparing news about AI with news about everything else, there’s much more talk about Google, robotics, machine learning, automata, and prophetic. You also see prominent themes within AI like virtual (mostly reality but some assistants), robots, and automation.
Google dominates the immediate context of artificial intelligence. It appears within four words of the term 150 times. The next closest organization is Microsoft with 26 mentions in that window. MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) occurs in the tiny window 16 times; IBM appears 13 times.
As people following the field know, Google really is prominent in artificial intelligence work. However, its dominance isn’t actually as strong as it appears in this data.
The main thing working in Google’s favor is that it’s had a great 2016 with DeepMind/AlphaGo winning against Lee Sedol. If we go back to just 2015, Google doesn’t appear as an important collocate with artificial intelligence.
This is despite the fact that Google has had plenty of other news around AI, like releasing Parsey McParseface. But those other topics haven’t gotten nearly the press traction or the tight association with artificial intelligence. This is not really a criticism, just an observation that their great press can be traced back to a single (important) effort.
This probably doesn’t (and shouldn’t) worry Google. Discussions around artificial intelligence tend to be fairly high-level or buzzwordy. One of the major themes that doesn’t get its own cluster is the competition between tech giants (and others) to win developers over to their tools for AI.
To the degree that these AI platforms are vying for technical eyeballs, it’s probably more important to win on other keywords like machine learning and specific techniques. If we look at mutual information for machine intelligence, Google is much lower than it is for artificial intelligence—29 mentions, MI of 3.04. Microsoft is still in second place, with 16 mentions and 3.77 MI. Happily for Microsoft, these mentions tend to be around Azure Machine Learning and other products and not about Tay.
About 16% of the documents mention human/humans/humanity/person/people within 25 words of artificial intelligence. There’s a fair amount of diversity in this cluster and in many ways it is really just a superset of all the other issues. But broadly, one of the preoccupations of journalists, companies, and researchers is how AI systems perform relative to humans, which is of course the story of the Turing test, self-driving cars, and AlphaGo.
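A figure like that 16% comes from a simple proximity check. Here is a rough sketch, assuming each document is already lowercased and split into tokens, and that the denominator is the set of documents mentioning the phrase. The function names and the four toy documents are hypothetical.

```python
HUMAN_TERMS = {"human", "humans", "humanity", "person", "people"}

def has_term_near_phrase(tokens, phrase, terms, window=25):
    """True if any of `terms` occurs within `window` tokens of `phrase`."""
    phrase = list(phrase)
    m = len(phrase)
    for i in range(len(tokens) - m + 1):
        if tokens[i:i + m] == phrase:
            left = tokens[max(0, i - window):i]
            right = tokens[i + m:i + m + window]
            if terms.intersection(left) or terms.intersection(right):
                return True
    return False

def share_near_phrase(docs, phrase=("artificial", "intelligence"),
                      terms=HUMAN_TERMS, window=25):
    """Fraction of documents where a term appears near the phrase."""
    if not docs:
        return 0.0
    hits = sum(has_term_near_phrase(doc, phrase, terms, window) for doc in docs)
    return hits / len(docs)

# Toy documents (made up, not from the NOW corpus).
docs = [
    "many people worry about artificial intelligence".split(),
    "artificial intelligence beat the go champion".split(),
    "can artificial intelligence outperform a human driver".split(),
    "ibm sells artificial intelligence to developers".split(),
]
print(share_near_phrase(docs))
```

Two of the four toy documents have a human term inside the window, so this prints 0.5. Shrinking `window` makes the test stricter, which is the same trade-off as moving from the 25-word window to the four-word span above.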
AI is often said to approximate human cognitive structures. Neural nets and deep learning are commonly described in terms of modeling the human brain. This is part of what’s behind IBM’s brand focus on cognitive computing, a term they tend to use in place of artificial intelligence.
Whether or not you want to claim that AI systems are anything like human brains, it is clearly the case that these systems are meant to affect humans. And in general AI efforts are meant to help people (e.g., Facebook captions for the blind or Google working on daily tasks). This is part and parcel of the big current focus on chatbots/virtual assistants, which also get their own cluster.
And while there are those who look forward to AI replacing particular kinds of chores and jobs, there is of course a whole cluster of articles devoted to DOOM and/or the attempt to pacify fears of robot monsters replacing us all entirely.
In his 1986 book, The Society of Mind, Minsky described intelligence in terms of lots of smaller, diverse parts. The cluster around virtual assistants is partly about helping with quotidian tasks with lots of different “agents” doing different things. But it’s also about the competition to own the platforms that developers use to build even more services.
The themes from the last 18 months show us what’s possible with greater amounts of data and processing power. Next week, we’ll zoom out to put these trends in historical context. AI researchers talk about AI winters in which they wouldn’t even call what they were doing artificial intelligence. Are we in an AI springtime? Is winter coming?