First they figured out how to play checkers and backgammon. Then they mastered chess, Go, “Jeopardy!” and even a few Atari video games. Now computers can challenge humans at the poker table — and win.
DeepStack, a software program developed at the University of Alberta’s Computer Poker Research Group, took on 33 professional poker players in more than 44,000 hands of Texas hold ’em. Overall, the program won by a significantly higher margin than if it had simply folded in each round, according to a new study in Science.
Poker is vastly more complex than games previously mastered — or “solved” — by artificial intelligence, such as checkers, chess and Go. In those simpler games, both players have identical information about the state of the game. But that’s not the case with poker, because players keep their cards secret and take turns bluffing and betting on who has the better hand.
This presents a challenge for computers, which until recently had trouble coping with such uncertainties.
DeepStack’s game of choice is a version of poker called heads-up, no-limit Texas hold ’em, a two-player game that allows for unlimited betting amounts.
A game can last up to four rounds. Players are first dealt a two-card hand, which is kept private. In the latter three rounds, the dealer draws five cards — the flop, the turn and the river — that both players can use. The goal is to come up with the strongest five-card combination.
In each round, players can do one of four things. They can check, or stand pat for the time being; bet, which is placing a wager or matching the same amount as previous players; raise the bet, forcing others to do the same if they want to stay in the round; or fold, which is poker-speak for dropping out of the hand.
With practice, humans can easily learn the rules of the game. But for a computer, heads-up no-limit is dizzyingly complex.
The game involves about 10^160 (a 1 followed by 160 zeroes) decision points. That’s more unique scenarios than there are atoms in the universe.
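To give a rough sense of that scale, the comparison can be worked out directly. The figure of 10^160 decision points comes from the study; the figure of roughly 10^80 atoms in the observable universe is a commonly cited order-of-magnitude estimate and is an assumption here, not a number from the researchers.

```python
# Rough scale comparison, using exact integer arithmetic.
# 10**160 is the figure reported for heads-up no-limit hold 'em.
# 10**80 is an ASSUMED order-of-magnitude estimate for atoms in the
# observable universe; it does not come from the study.
decision_points = 10**160
atoms_in_universe = 10**80

# How many decision points there would be for every single atom.
ratio = decision_points // atoms_in_universe

# Count the zeroes in the ratio to express it as a power of ten.
zeroes_in_ratio = len(str(ratio)) - 1
print(f"Decision points per atom: a 1 followed by {zeroes_in_ratio} zeroes")
```

Under that assumption, there would still be an astronomically large number of poker scenarios left over for every atom, which is why listing them all in advance is not practical.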
Computers managed to “solve” games such as checkers by calculating an unbeatable strategy before a match even starts. That approach doesn’t quite work for poker.
In the card game, you make your decision based on the odds that your opponent has a good hand. You study their actions for clues about their cards.
All the while, your opponent is studying you. Their decisions will depend on what they believe about your hidden cards, as well as what your actions reveal about the strength of your hand.
To master this kind of recursive face-off, DeepStack doesn’t even try to pre-strategize the entire match. Instead, the program focuses on a particular situation as it comes up in the game, only looking a few actions ahead. When the circumstances of the game change, so can DeepStack’s strategy.
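The general idea of looking only a few actions ahead and falling back on an estimate can be sketched with a toy example. This is not DeepStack’s algorithm, which has to reason about hidden cards; it is a minimal sketch on a simple stone-taking game where nothing is hidden (players alternately take one to three stones, and whoever takes the last stone wins). The names `estimate`, `lookahead` and `choose_move` are hypothetical, and `estimate` is a hand-written stand-in for the kind of learned “gut feeling” described in the article.

```python
def estimate(stones: int) -> float:
    """Hypothetical stand-in for a learned value estimate.

    Returns a guess at the outcome for the player about to move:
    +1.0 for a likely win, -1.0 for a likely loss.
    """
    return -1.0 if stones % 4 == 0 else 1.0


def lookahead(stones: int, depth: int) -> float:
    """Value for the player to move, searching at most `depth` moves ahead."""
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1.0
    if depth == 0:
        # Stop searching and trust the estimate instead of playing to the end.
        return estimate(stones)
    # Negamax: a position is worth the best of the negated opponent values.
    return max(
        -lookahead(stones - take, depth - 1)
        for take in (1, 2, 3)
        if take <= stones
    )


def choose_move(stones: int, depth: int = 2) -> int:
    """Re-plan from the current position only, a few moves deep."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: -lookahead(stones - take, depth - 1))


# The plan is recomputed each turn from whatever position has come up.
print(choose_move(10))
```

Because `choose_move` is called fresh on every turn, the plan adapts whenever the position changes, rather than being fixed before the game begins.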
“It will do its thinking on the fly while it’s playing,” said Michael Bowling, a computer scientist at the University of Alberta and an author of the study.
DeepStack plays poker like an experienced human player. Bowling and his colleagues “trained” the program by pitting it against itself in millions of randomly generated poker situations. That has given it a kind of robot version of intuition, what Bowling described as a “gut feeling.”
“It can actually generalize situations that it’s never seen before,” he said.
DeepStack’s training process uses the kind of “deep learning” technology that powers Apple’s Siri voice recognition system and enables self-driving cars to recognize the difference between road signs and hazards. The training data is used to fit a deep neural network, which the program then consults to size up situations as they arise in a game.
The result is a poker player that never tires during marathon matches, bets more aggressively than any human would dare and runs on a laptop.
To test its chops, the researchers invited 33 professionals from the International Federation of Poker to play against DeepStack. Each of the pros was asked to play 3,000 hands over the course of a month.
Only 11 players completed all 3,000 hands. Among them, 10 were defeated so badly that their losses could not be written off as a statistical fluke. Indeed, the program won by a margin roughly eight times greater than what a professional human would consider good.
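One common way to judge whether a result could be a fluke is to compare the average winnings per hand with how much that average would be expected to wobble by chance. The sketch below shows that calculation in its simplest form. It is an illustration of the general idea, not the analysis used in the study; the function name `z_score` is hypothetical and the sample numbers are made up.

```python
import math
from statistics import mean, stdev


def z_score(per_hand_winnings: list[float]) -> float:
    """Average winnings divided by the standard error of that average."""
    n = len(per_hand_winnings)
    standard_error = stdev(per_hand_winnings) / math.sqrt(n)
    return mean(per_hand_winnings) / standard_error


# Made-up per-hand results for illustration only, NOT data from the study.
sample = [3.0, -1.0, 2.0, 4.0, -2.0, 5.0, 1.0, 0.0]
print(round(z_score(sample), 2))
```

By convention, a score of about 2 or more is usually treated as unlikely to be chance alone. Poker results swing widely from hand to hand, which is why thousands of hands are needed before a winning margin stands out from the noise.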
The 11th player, Martin Sturc, also lost to the machine — though the margin was too small to be statistically significant.
“After a couple of hands, I realized that DeepStack has a very solid poker strategy, although he also made some moves that are not really common in the ‘real poker world,’” said Sturc, who is based in Austria. “I guess that some plays represent the way a hand should be played optimally, but humans have just not figured it out yet.”
Sturc said the experience will help him refine his game by prompting him to reconsider his go-to strategies and think outside the box.
But to the researchers, DeepStack is much more than a high-tech poker teacher.
The program’s ability to develop hard-to-beat strategies could be useful in the realms of national security and medicine, Bowling said.
For instance, doctors coming up with regimens to manage diabetes must account for changes in a patient’s diet, stress and metabolism. In other words, the strategy must be robust to uncertainty, just as in a game of poker.
For now, DeepStack can challenge only one opponent at a time. In the next few years, however, Bowling hopes to develop artificial intelligence that can play against multiple opponents.