- Go is far more challenging for computers than, say, chess for two reasons: the number of potential moves each turn is far higher, and there is no simple way to measure material advantage. A player must therefore learn to recognize abstract patterns in hundreds of pieces placed across the board. And even experts often struggle to explain why a particular position seems advantageous or problematic.
- “Go is the most complex and beautiful game ever devised by humans,” Demis Hassabis, head of the Google team, and himself an avid Go player, said at a press briefing. By beating Fan Hui, he added, “our program achieved one of the long-standing grand challenges of AI.”
- And this March it will take on one of the world’s best players, Lee Sedol, in a tournament to be held in Seoul, South Korea.
This is huge news. Here's Yudkowsky:

People occasionally ask me about signs that the remaining timeline might be short. It's very easy for nonprofessionals to take too much alarm too easily. Deep Blue beating Kasparov at chess was not such a sign. Robotic cars are not such a sign. This is. "Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves... Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0."

Repeat: IT DEFEATED THE EUROPEAN GO CHAMPION 5-0. As the authors observe, this represents a break of at least one decade faster than trend in computer Go. This matches something I've previously named in private conversation as a warning sign - sharply above-trend performance at Go from a neural algorithm. What this indicates is not that deep learning in particular is going to be the Game Over algorithm. Rather, the background variables are looking more like "Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it." What's alarming is not this particular breakthrough, but what it implies about the general background settings of the computational universe.

To try spelling out the details more explicitly, Go is a game that is very computationally difficult for traditional chess-style techniques. Human masters learn to play Go very intuitively, because the human cortical algorithm turns out to generalize well. If deep learning can do something similar, plus (a previous real sign) have a single network architecture learn to play loads of different old computer games, that may indicate we're starting to get into the range of "neural algorithms that generalize well, the way that the human cortical algorithm generalizes well".

This result also supports that "Everything always stays on a smooth exponential trend, you don't get discontinuous competence boosts from new algorithmic insights" is false even for the non-recursive case, but that was already obvious from my perspective. Evidence that's more easily interpreted by a wider set of eyes is always helpful, I guess. Next sign up might be, e.g., a similar discontinuous jump in machine programming ability - not to human level, but to doing things previously considered impossibly difficult for AI algorithms.

I hope that everyone in 2005 who tried to eyeball the AI alignment problem, and concluded with their own eyeballs that we had until 2050 to start really worrying about it, enjoyed their use of whatever resources they decided not to devote to the problem at that time.
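For anyone who wants a picture of what "value networks" and "policy networks" mean in that abstract, here is a toy sketch in Python. Both networks are random stubs, and the way they are combined below is just one simple illustration, not the paper's actual method: the policy network suggests which moves are worth considering, and the value network scores how good the resulting positions look, which is already enough to pick a move with no lookahead at all.

```python
import random

# Toy stand-ins for the two trained networks the abstract describes.
def policy_network(position, legal_moves):
    # Real version: a deep network assigning a probability to each legal move.
    weights = [random.random() for _ in legal_moves]
    total = sum(weights)
    return {move: w / total for move, w in zip(legal_moves, weights)}

def value_network(position):
    # Real version: a deep network estimating the chance of winning from here.
    return random.random()

def result_of(position, move):
    # Stub: a real implementation would return the board after playing `move`.
    return position + (move,)

def choose_move(position, legal_moves):
    """Pick a move with no lookahead search: combine the policy network's
    prior for each move with the value network's score for the position the
    move leads to (one simple combination, not the one used in the paper)."""
    priors = policy_network(position, legal_moves)
    score = lambda m: priors[m] + value_network(result_of(position, m))
    return max(legal_moves, key=score)

if __name__ == "__main__":
    print(choose_move(position=(), legal_moves=["D4", "Q16", "C3"]))
```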
What happens when there's nothing left we can do better...? This kind of news is always a little disheartening to me.
sci-hub link. This is cool, but not nearly as profound as Yudkowsky thinks it is. It's a cool metaheuristic for Go, but state space search is state space search.
I am not sure what to think. The fact that AI researchers sound very impressed is impressive.

c_hawkthorne commented on an earlier post, less than two years old. The article profiles Rémi Coulom, who has been working on board game AI for decades. Even with Monte Carlo, another ten years may prove too optimistic:

And while programmers are virtually unanimous in saying computers will eventually top the humans, many in the Go community are skeptical. “The question of whether they’ll get there is an open one,” says Will Lockhart, director of the Go documentary The Surrounding Game. “Those who are familiar with just how strong professionals really are, they’re not so sure.”

Of AlphaGo, Coulom says “This is a really big result, it’s huge.”

The Nature article c_hawkthorne mentioned makes it sound like the software wasn't specific to Go, but later it does sound like fairly focused training:

The software was already competitive with the leading commercial Go programs, which select the best move by scanning a sample of simulated future games. DeepMind then combined this search approach with the ability to pick moves and interpret Go boards — giving AlphaGo a better idea of which strategies are likely to be successful.

I ask Coulom when a machine will win without a handicap. “I think maybe ten years,” he says. “But I do not like to make predictions.”

His caveat is a wise one. In 2007, Deep Blue’s chief engineer, Feng-Hsiung Hsu, said much the same thing. Hsu also favored alpha-beta search over Monte Carlo techniques in Go programs, speculating that the latter “won’t play a significant role in creating a machine that can top the best human players.”
It's a cool metaheuristic for Go
The IBM chess computer Deep Blue, which famously beat grandmaster Garry Kasparov in 1997, was explicitly programmed to win at the game. But AlphaGo was not preprogrammed to play Go: rather, it learned using a general-purpose algorithm that allowed it to interpret the game’s patterns, in a similar way to how a DeepMind program learned to play 49 different arcade games.
It first studied 30 million positions from expert games, gleaning abstract information on the state of play from board data, much as other programmes categorize images from pixels. Then it played against itself across 50 computers, improving with each iteration, a technique known as reinforcement learning.
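For concreteness, here is a rough sketch of that two-stage pipeline in Python. Everything in it is a placeholder standing in for DeepMind's actual networks and infrastructure (the StubNetwork class, the stub self-play function, and the position and move encodings are all hypothetical): first supervised learning on recorded expert positions, then reinforcement learning against snapshots of itself.

```python
import random

class StubNetwork:
    """Placeholder for the policy network; the real thing is a deep
    convolutional network, not a dictionary of position-move weights."""
    def __init__(self, table=None):
        self.table = dict(table or {})

    def update(self, position, move, weight=1.0):
        key = (position, move)
        self.table[key] = self.table.get(key, 0.0) + weight

    def copy(self):
        return StubNetwork(self.table)

def supervised_stage(network, expert_positions):
    # Stage 1: supervised learning - nudge the network toward predicting the
    # move a human expert actually played in each recorded position.
    for position, expert_move in expert_positions:
        network.update(position, expert_move)
    return network

def stub_self_play(current, opponent):
    # Stub: a real implementation would play a full game of Go between the
    # two networks and report the moves `current` made plus whether it won.
    return [((), "pass")], random.random() < 0.5

def reinforcement_stage(network, play_game, iterations=50):
    # Stage 2: reinforcement learning - play against earlier snapshots of
    # itself; moves from games the network went on to win are reinforced,
    # moves from lost games are penalized.
    snapshots = [network.copy()]
    for _ in range(iterations):
        opponent = random.choice(snapshots)
        moves_played, won = play_game(network, opponent)
        for position, move in moves_played:
            network.update(position, move, weight=1.0 if won else -1.0)
        snapshots.append(network.copy())
    return network

if __name__ == "__main__":
    net = supervised_stage(StubNetwork(), [((), "D4"), ((), "Q16")])
    net = reinforcement_stage(net, stub_self_play)
    print(len(net.table), "position-move weights learned")
```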
Oh, it's exciting, because it's a big state space and good metaheuristics for big state spaces are exciting. Having too large a space to search is the thing that makes AI applications hard. I don't mean to belittle it, just to object to the passage flagamuffin quoted extrapolating out to AI Eschaton. It learns to explore game trees, without exploring the whole intractably large tree but without skipping too many of the good branches either; you can probably teach it to play chess and checkers too. You are not going to teach it to solve differential equations, unless you want to try to represent solving differential equations as a two-player strategy game. Like every learning algorithm now or in the future, it learns a particular class of function.
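To make the "explore the tree without visiting all of it" point concrete, here is a bare-bones Monte Carlo tree search in Python over a generic two-player game. The Game interface and the toy Nim demo are made up for illustration, and real Go programs layer heavy domain knowledge (and, in AlphaGo's case, the neural networks) on top of this skeleton.

```python
import math
import random

# Minimal Monte Carlo tree search. The hypothetical `game` interface:
#   game.moves(state)      -> legal moves, [] when the game is over
#   game.play(state, move) -> next state (turn passes to the other player)
#   game.winner(state)     -> 1 or 2 for the winner of a finished game
#   game.to_move(state)    -> which player moves next in this state

class Node:
    def __init__(self, state, parent=None, move=None, mover=None):
        self.state, self.parent, self.move = state, parent, move
        self.mover = mover            # player who made `move` to reach this state
        self.children, self.untried = [], None
        self.visits, self.wins = 0, 0.0

def uct_child(node, c=1.4):
    # Prefer children with a high win rate, plus an exploration bonus for
    # branches that have been tried only a few times.
    return max(node.children, key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(game, root_state, simulations=5000):
    root = Node(root_state)
    root.untried = list(game.moves(root_state))
    for _ in range(simulations):
        node = root
        # 1. Selection: walk down fully expanded nodes along promising lines.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one previously untried move as a new leaf.
        if node.untried:
            move = node.untried.pop()
            mover = game.to_move(node.state)
            child = Node(game.play(node.state, move), node, move, mover)
            child.untried = list(game.moves(child.state))
            node.children.append(child)
            node = child
        # 3. Simulation: finish the game with random moves.
        state = node.state
        while game.moves(state):
            state = game.play(state, random.choice(game.moves(state)))
        winner = game.winner(state)
        # 4. Backpropagation: credit every node on the path whose mover won.
        while node is not None:
            node.visits += 1
            if node.mover == winner:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

# Toy demo: one-pile Nim - take 1-3 stones, whoever takes the last stone wins.
class Nim:
    def moves(self, state):
        stones, _ = state
        return [n for n in (1, 2, 3) if n <= stones]
    def play(self, state, move):
        stones, player = state
        return (stones - move, 3 - player)
    def winner(self, state):
        _, player = state
        return 3 - player            # the player who just took the last stone
    def to_move(self, state):
        return state[1]

if __name__ == "__main__":
    # With enough simulations this converges to taking 2 from a pile of 10.
    print(mcts(Nim(), root_state=(10, 1)))
```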