But as simple as the rules are, Go is a game of profound complexity. The search space in Go is vast – more than a googol times larger than chess (a number greater than the number of atoms in the universe!). As a result, traditional “brute force” AI methods – which construct a search tree over all possible sequences of moves – don’t have a chance in Go. To date, computers have played Go only as well as amateurs. Experts predicted it would be at least another 10 years until a computer could beat one of the world’s elite group of Go professionals.
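To make the googol comparison concrete, here is a back-of-the-envelope calculation using the usual rough estimates of branching factor and game length – about 250 legal moves over ~150 plies for Go, and ~35 over ~80 for chess. These figures are common textbook approximations, not numbers from the paper:

```python
# Rough game-tree size estimate: (branching factor) ** (game length).
# The branching/length figures below are common textbook approximations,
# not numbers taken from the AlphaGo paper.
from math import log10

go_exp = 150 * log10(250)    # Go:    ~250 moves/position, ~150 plies -> ~10^360
chess_exp = 80 * log10(35)   # chess: ~35 moves/position,  ~80 plies  -> ~10^123

print(f"Go tree    ~ 10^{go_exp:.0f}")
print(f"chess tree ~ 10^{chess_exp:.0f}")
# The gap is ~10^237 -- far more than a googol (10^100).
print(f"ratio      ~ 10^{go_exp - chess_exp:.0f}")
```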
We saw this as an irresistible challenge! We started building a system, AlphaGo, described in a paper in Nature this week, that would overcome these barriers. The key to AlphaGo is reducing the enormous search space to something more manageable. To do this, it combines a state-of-the-art tree search with two deep neural networks, each of which contains many layers with millions of neuron-like connections. One neural network, the “policy network”, predicts the next move, and is used to narrow the search to consider only the moves most likely to lead to a win. The other neural network, the “value network”, is then used to reduce the depth of the search tree – estimating the winner in each position in place of searching all the way to the end of the game.
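As a rough illustration of the idea – not AlphaGo’s actual algorithm or API; `policy_net`, `value_net`, `legal_moves`, and `play` below are hypothetical stand-ins – a depth-limited search can use the policy network to prune breadth and the value network to cut off depth:

```python
# Minimal sketch of policy- and value-guided search. All functions here
# are hypothetical placeholders, not AlphaGo's real components.
import random

def legal_moves(position):
    """Stand-in move generator."""
    return ["a", "b", "c", "d", "e"]

def play(position, move):
    """Stand-in transition function."""
    return position + move

def policy_net(position):
    """Stand-in policy network: (move, prior probability) pairs."""
    moves = legal_moves(position)
    return [(m, 1.0 / len(moves)) for m in moves]

def value_net(position):
    """Stand-in value network: estimated win probability in [0, 1]
    for the player to move."""
    return random.random()

def search(position, depth, top_k=3):
    """Reduce breadth by expanding only the top_k moves the policy
    network suggests, and reduce depth by scoring the leaf with the
    value network instead of playing to the end of the game."""
    if depth == 0:
        return value_net(position)
    best = 0.0
    candidates = sorted(policy_net(position), key=lambda mp: -mp[1])[:top_k]
    for move, _prior in candidates:
        # After our move it is the opponent's turn, so our value is
        # one minus the value from their perspective (minimax step).
        best = max(best, 1.0 - search(play(position, move), depth - 1, top_k))
    return best

print(search("", depth=3))
```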
David Silver and Demis Hassabis
Fascinating – and rather unexpected – development in the field of artificial intelligence: an algorithm that can consistently best human players at Go, the last major deterministic perfect-information game where humans have (had?) the upper hand. I must admit, I am starting to understand why prominent figures are worried that AI research is moving too fast and that the world is ill-prepared for the rapid changes it will bring…