Artificial intelligence bot vs. the poker pros

An artificial intelligence program confounds poker pros with its inexplicable tactics

Doug Polk, one of the world's best poker players, shoveled egg whites into his mouth with a plastic fork and slurped unsweetened oatmeal from a paper cup, 13 days into the oddest tournament he has ever entered.

His opponent, Claudico, did not struggle with fatigue, mental breakdown or hunger, despite having played 80,000 hands over a two-week period, a schedule that was four times more rigorous than Polk's. Nor did Claudico crack dumb jokes or worry about maintaining a poker face.

Claudico has no face at all. It is a computer program, an artificial intelligence bot, and one of the savviest computer poker players in history.

Claudico and its immediate predecessor have defeated all the best computer programs, as well as a pretty good professional player. But Claudico had never battled elite professional players until this April, when its developers staged a two-week showdown at a downtown casino.

The Carnegie Mellon University programmers were eager to see how Claudico would measure up. Would it attain the status of Deep Blue, which defeated grandmaster Garry Kasparov at chess in 1997, or Watson, which crushed the top "Jeopardy!" players in 2011? Or would it reveal itself a hollow shell, unworthy of a seat at the green felt table?

The contest was part exhibition, part science experiment, and part test of humanity's limits. Lead scientist Tuomas Sandholm recruited four players recommended by top professionals to compete in a type of Texas hold 'em poker known as "heads-up, no-limit," a one-on-one game involving an especially complex array of betting strategies and choices.

"As a human, you always try to get in people's heads," said Bjorn Li, a 25-year-old from Hong Kong who played on the human team. "With this guy, you can't really do that, because he just plays according to his algorithms."

The players believed any success they might have against the machine would be fleeting. "We at least wanted a first-round victory," said Jason Les, a 29-year-old pro from Costa Mesa who took part in the tournament. "We know they'll eventually crush us."

Claudico was not programmed to play poker, per se. It was given the rules of poker. Then it spent months playing more hands against itself than humans have ever played, learning to navigate the dizzying number of possible game situations, a figure that exceeds the number of atoms in the universe, said Noam Brown, a graduate student who worked on Claudico.
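The article does not say which learning algorithm Claudico used, so the following is only a toy illustration of the self-play idea, not the team's method. It swaps in regret matching, a standard technique for two-player zero-sum games, and applies it to rock-paper-scissors rather than poker. The payoff table, the biased starting regrets and the function names are all assumptions made for the example.

```python
# Toy self-play sketch (assumed technique: regret matching; assumed game:
# rock-paper-scissors). Nothing here is taken from Claudico itself.

# PAYOFF[a][b] is the payoff for playing action a against action b,
# with 0 = rock, 1 = paper, 2 = scissors. The game is symmetric.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regret(regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regret]
    total = sum(positive)
    if total == 0:
        return [1.0 / 3] * 3  # no regret yet: mix uniformly
    return [p / total for p in positive]


def exploitability(strategy):
    """Best payoff an opponent can achieve against this mixed strategy."""
    return max(
        sum(PAYOFF[a][b] * strategy[b] for b in range(3)) for a in range(3)
    )


def self_play(iterations=10000):
    # Hypothetical starting point: player 0 begins biased toward rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regret(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each action against the opponent's mix.
            values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(strategies[player][a] * values[a] for a in range(3))
            for a in range(3):
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    # The average strategy over all rounds is what approaches equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for average in self_play():
        print([round(p, 3) for p in average], round(exploitability(average), 4))
```

Neither player is told how to play well. Each one only keeps track of how much better every action would have done, and the averaged strategies drift toward a mix that is hard to exploit.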

For Brown and the other programmers, poker is the measuring stick, but not the goal.

They are really aiming to advance the field of artificial intelligence to fight cyberwars, perform negotiations and plan medical treatments, among other tasks that require complex decision-making with limited information.

Hold 'em poker, in this regard, offers a different challenge than chess or "Jeopardy!" because two cards are dealt facedown to each player; an opponent always has a large chunk of information missing. Five cards are then dealt face up for both players to use in forming their best potential poker hand.
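To put a rough number on that missing information, the short sketch below counts how many two-card holdings an opponent could have, given the cards one player can see. It assumes a standard 52-card deck. The function name is made up for this example.

```python
from math import comb

DECK_SIZE = 52   # standard deck (assumption; the article does not spell this out)
HOLE_CARDS = 2   # facedown cards dealt to each player


def possible_opponent_hands(board_cards_seen):
    """Count two-card holdings the opponent could have, from one player's view."""
    # Cards this player cannot see: the deck minus their own hole cards
    # and minus any face-up board cards.
    unseen = DECK_SIZE - HOLE_CARDS - board_cards_seen
    return comb(unseen, HOLE_CARDS)


print(possible_opponent_hands(0))  # before any board cards: 1225
print(possible_opponent_hands(5))  # after all five board cards: 990
```

Even with every shared card on the table, a player still has to reason over hundreds of hands the opponent might hold.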

During one of the final days of competition, metal bleachers set up for spectators at Rivers Casino were empty. Polk, 26, a brash former World Series of Poker champion originally from Pasadena, and Dong Kim, a spiky-haired 26-year-old from Seattle, sat at small tables in front of computer screens, clicking raises, calls and folds faster than most people could follow.

The large-screen televisions showing their every move might as well have been broadcasting a Soviet budget for all the excitement the competition was generating.

Rachelle Watson, a 31-year-old schoolteacher sipping a whiskey with a friend at the adjacent bar, did not notice what was going on a few feet away, despite banners billing the epic "Brains vs. Artificial Intelligence" showdown.

Watson had just lost $60 playing blackjack and slots, so the prospect of losing to a machine did not impress her. "They beat us every day," she said.

The other two pros, Les and Li, were playing their games in one of the casino's windowless back offices under fluorescent lights.

The casino put up half of the $100,000 prize money, with Microsoft footing the rest.

To win permission from the state gambling board, which does not allow online poker in Pennsylvania, organizers made the tournament technically an exhibition, with each player earning a minimum $10,000 appearance fee and then splitting the remaining money depending on the order in which they finished.

They played from 11 a.m. to 10 p.m.; on many nights, they stayed up several hours studying the day's hands in hopes of deciphering Claudico's strategy and finding leaks in its game.

The bot had numerous tics, some confounding and some annoying. It took a long time to decide how to bet on the final card, something these players are not accustomed to when playing online against fellow humans.

Claudico was designed to calculate which among the vast combinations of cards and bets is most likely to win the most chips. The bot also used unorthodox betting styles, sometimes wagering large amounts of cash when just a few chips were at stake, and betting little on hands that might elicit a larger wager from a pro.

"What!? He just check-called, check-raised," Polk shouted to Kim at one point, exasperated at the computer's pattern of passive and aggressive play.

Polk folded the hand, but took note. The bot risked all its available chips on one hand while holding a 10 and a 5 of different suits — very bad cards — and bet big on another hand when the chances that its opponent could make a full house or a flush were great.

"It has a very sophisticated model," said Sandholm, the lead developer. "It just doesn't know that it's bluffing because it doesn't know the word 'bluff.'"

Unlike professionals, Claudico did not track its opponents' strategies. And its own game seemed random.

"We're just very mindful that it does these bizarre things," Kim said.

The computer also plays without ego. It does not care if it wins respect or exploits a weak opponent.

"It has no feeling at all," Sandholm said.

Its name, Claudico, means "I Limp" in Latin, a reference to the fact that it does not mind calling a bet in a fashion that many professional poker players believe to be weak and foolish.

Polk is respectful. But he is not ready to concede an analytical advantage to the bot. Players like him are hardly playing from gut instinct. They normally use sophisticated software of their own while playing online and they maintain a log of their opponents' tendencies. They spend hours studying potential plays.

As the tournament neared its end, after tens of thousands of hands were completed, the pros became a bit loopy.

Li, at one point holding a pair of fours and staring into the computer, began calling playfully to his rival: "Claudico! Claudico!"

He twitched his fingers, hummed and scratched his head. The fidgeting was captured on a video stream that devoted poker fans could watch from home.

"We all just want to wrap up," Kim said.

By then, humans held a strong advantage. All but one, Les, owned an individual lead of at least $100,000 in chips over the bot.

Claudico seemed sure to lose, but its master put on a brave face.

"I was kind of hoping for this, that it wouldn't be a walkover," Sandholm said as the last hands were being dealt.

He insisted it was actually a draw — "not statistically significant."

Polk smiled and shook his head. With a few hundred hands left, the pros were leading by more than $700,000 in chips. They had won more chips from Claudico than they lost on nine of 13 full days of play.
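Whether a lead that size is "statistically significant" depends on how much results swing from hand to hand, a figure the article does not report. The sketch below works through the arithmetic using the article's own numbers, a lead of about $700,000 in chips over roughly 80,000 hands. It treats the per-hand standard deviation as a placeholder and assumes hands are independent. The three standard deviation values are hypothetical, chosen only to show how much the answer moves.

```python
from math import sqrt


def mean_per_hand(total_chips, hands):
    """Average chips won per hand."""
    return total_chips / hands


def z_score(total_chips, hands, per_hand_std):
    """How many standard errors the average win sits above zero.

    per_hand_std is an assumed input; the article gives no variance figure.
    """
    standard_error = per_hand_std / sqrt(hands)
    return mean_per_hand(total_chips, hands) / standard_error


lead, hands = 700_000, 80_000  # approximate figures reported in the article
print(mean_per_hand(lead, hands))  # 8.75 chips per hand

# Hypothetical per-hand standard deviations, for illustration only.
for assumed_std in (500.0, 1500.0, 3000.0):
    print(assumed_std, round(z_score(lead, hands, assumed_std), 2))
```

The same lead looks decisive if hands swing by a few hundred chips and much less so if they swing by thousands, which is why the two sides could read one scoreboard so differently.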

"I'm not a scientist, so I can't really say this. But this is a pretty good win ratio," Polk said. "If the shoe was on the other foot … I don't think they'd have the same attitude: 'Oh, it's a tie.'"

Polk knew he couldn't get too cocky. A team at the University of Alberta announced in January that it had created a bot that had "essentially solved" limit hold 'em, a less complex version of the game Claudico plays that caps the amount a player can bet and raise at a given moment.

"Things are going to start rapidly improving," said Michael Bowling, the professor who led the Alberta team. Mastering the more complex "no-limit" game is one to three years away, he estimates.

Sandholm's team is already working on the next upgrade to Claudico, due out by November.

Polk is hedging his bets, teaching himself new games such as Omaha and Triple Draw. Mostly, he is trying to stay ahead of other players who he says are no longer willing to accept his challenges in no-limit Texas hold 'em. But he also knows the bots are coming.

"Right now, it looks pretty safe," he said, pointing to the scoreboard. "Humans are doing OK."

noah.bierman@latimes.com

Twitter: @noahbierman

Copyright © 2016, Los Angeles Times