I am not sure what to think. The fact that AI researchers sound very impressed is impressive.

c_hawkthorne commented on an earlier post, less than two years old. The article profiles Rémi Coulom, who has been working on board game AI for decades:

> I ask Coulom when a machine will win without a handicap. “I think maybe ten years,” he says. “But I do not like to make predictions.” His caveat is a wise one. In 2007, Deep Blue’s chief engineer, Feng-Hsiung Hsu, said much the same thing. Hsu also favored alpha-beta search over Monte Carlo techniques in Go programs, speculating that the latter “won’t play a significant role in creating a machine that can top the best human players.” Even with Monte Carlo, another ten years may prove too optimistic. And while programmers are virtually unanimous in saying computers will eventually top the humans, many in the Go community are skeptical. “The question of whether they’ll get there is an open one,” says Will Lockhart, director of the Go documentary The Surrounding Game. “Those who are familiar with just how strong professionals really are, they’re not so sure.”

Of AlphaGo, Coulom says, “This is a really big result, it’s huge.”

The Nature article c_hawkthorne mentioned makes it sound like the software wasn't specific to Go, but later it does sound like fairly focused training:

> The software was already competitive with the leading commercial Go programs, which select the best move by scanning a sample of simulated future games. DeepMind then combined this search approach with the ability to pick moves and interpret Go boards — giving AlphaGo a better idea of which strategies are likely to be successful.
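For anyone wondering what the “Monte Carlo techniques” Hsu doubted actually do: this is Monte Carlo tree search (MCTS), which grows a game tree by repeatedly selecting promising moves, finishing games with random playouts, and backing the results up the tree. Here is a minimal sketch on a toy take-away game I picked purely for illustration (players remove 1 or 2 stones; whoever takes the last stone wins) — none of this is AlphaGo's actual code:

```python
import math
import random

# Toy game: one pile; players alternately remove 1 or 2 stones, and
# whoever takes the last stone wins. Leaving the opponent a multiple
# of 3 is a winning position.

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

class Node:
    def __init__(self, stones, just_moved, parent=None, move=None):
        self.stones = stones            # stones remaining
        self.just_moved = just_moved    # player (0/1) who made `move`
        self.parent = parent
        self.move = move
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0                 # wins from `just_moved`'s perspective

def uct_child(node, c=1.4):
    # Upper Confidence bound applied to Trees: exploitation + exploration.
    return max(node.children, key=lambda ch:
               ch.wins / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits))

def best_move(stones, to_move=0, iters=3000):
    root = Node(stones, just_moved=1 - to_move)
    for _ in range(iters):
        node = root
        # 1. Select: descend through fully expanded nodes via UCT.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expand: add one unexplored child.
        if node.untried:
            m = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.stones - m, 1 - node.just_moved, node, m)
            node.children.append(child)
            node = child
        # 3. Simulate: play random moves to the end of the game.
        s, mover = node.stones, node.just_moved
        while s > 0:
            mover = 1 - mover
            s -= random.choice(legal_moves(s))
        winner = mover  # whoever took the last stone wins
        # 4. Backpropagate the result up the tree.
        while node is not None:
            node.visits += 1
            node.wins += (winner == node.just_moved)
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

With enough iterations, `best_move(4)` settles on taking 1 stone (leaving the opponent a losing pile of 3). The appeal as a metaheuristic is that nothing here knows anything about strategy: the same loop works for any game with a move generator and a terminal test, which is why it scaled to Go where handcrafted evaluation functions had failed.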
It's a cool metaheuristic for Go
> The IBM chess computer Deep Blue, which famously beat grandmaster Garry Kasparov in 1997, was explicitly programmed to win at the game. But AlphaGo was not preprogrammed to play Go: rather, it learned using a general-purpose algorithm that allowed it to interpret the game’s patterns, in a similar way to how a DeepMind program learned to play 49 different arcade games.
> It first studied 30 million positions from expert games, gleaning abstract information on the state of play from board data, much as other programs categorize images from pixels. Then it played against itself across 50 computers, improving with each iteration, a technique known as reinforcement learning.
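The “played against itself, improving with each iteration” part is the self-play flavor of reinforcement learning. As a hedged illustration of the idea only — AlphaGo used deep policy and value networks, not a lookup table — here is a tabular self-play learner for the same kind of toy take-away game; every name and parameter below is my own choice, not anything from DeepMind:

```python
import random

# Toy stand-in for self-play reinforcement learning: a take-away game
# (remove 1 or 2 stones; taking the last stone wins). A single table
# Q[s][a] holds the value of action a for whichever player faces s
# stones. Both "players" share the table, so it trains against itself.

def legal(s):
    return [a for a in (1, 2) if a <= s]

def train(n_stones=20, episodes=20000, alpha=0.2, eps=0.3):
    Q = {s: {a: 0.0 for a in legal(s)} for s in range(1, n_stones + 1)}
    for _ in range(episodes):
        s = random.randint(1, n_stones)
        while s > 0:
            # Epsilon-greedy self-play: mostly exploit, sometimes explore.
            if random.random() < eps:
                a = random.choice(legal(s))
            else:
                a = max(Q[s], key=Q[s].get)
            nxt = s - a
            # Negamax-style bootstrap target: taking the last stone is a
            # win (+1); otherwise our value is minus the opponent's best.
            target = 1.0 if nxt == 0 else -max(Q[nxt].values())
            Q[s][a] += alpha * (target - Q[s][a])
            s = nxt
    return Q

# After training, the greedy policy leaves the opponent a multiple of 3.
Q = train()
policy = {s: max(Q[s], key=Q[s].get) for s in Q}
```

No expert games are involved: starting from an all-zero table, the learner discovers the winning strategy purely from the outcomes of its own games. AlphaGo's twist was to do the supervised step first (the 30 million expert positions) so that self-play started from a strong policy instead of random noise.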