Friday, February 03, 2017

Hail Libratus! AI beats human pros in no-limit Texas Hold'em



AI now dominates humans in one narrowly defined task after another. Perhaps another 30-50 years until AGI?
IEEE Spectrum: Humanity has finally folded under the relentless pressure of an artificial intelligence named Libratus in a historic poker tournament loss. ...

Libratus lived up to its “balanced but forceful” Latin name by becoming the first AI to beat professional poker players at heads-up, no-limit Texas Hold'em. The tournament was held at the Rivers Casino in Pittsburgh from 11–30 January. Developed by Carnegie Mellon University, the AI won the “Brains vs. Artificial Intelligence” tournament against four poker pros by US $1,766,250 in chips over 120,000 hands (games). Researchers can now say that the victory margin was large enough to count as a statistically significant win, meaning that they could be at least 99.98 percent sure that the AI victory was not due to chance.
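
The article doesn't say how that confidence level was computed, and the researchers relied on variance-reduction techniques (such as duplicate hands) that are not modeled here. Purely as an illustration of how a win rate over 120,000 hands maps to a confidence level, here is a normal-approximation sketch; the per-hand standard deviation is a hypothetical placeholder, not a measured value.

```python
# Back-of-the-envelope normal approximation for the significance of a win
# rate over many hands. This is NOT the researchers' actual analysis (they
# used variance-reduction techniques such as duplicate hands); the per-hand
# standard deviation below is a hypothetical placeholder.
from math import sqrt
from statistics import NormalDist

mean_bb_per_hand = 0.147      # observed win rate: 14.7 bb per 100 hands
std_bb_per_hand = 15.0        # assumed per-hand standard deviation (illustrative)
n_hands = 120_000

z = mean_bb_per_hand / (std_bb_per_hand / sqrt(n_hands))
confidence = NormalDist().cdf(z)
print(f"z = {z:.2f}, one-sided confidence = {confidence:.4f}")
```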

... the victory demonstrates how AI has likely surpassed the best humans at doing strategic reasoning in “imperfect information” games such as poker. The no-limit Texas Hold’em version of poker is a good example of an imperfect information game because players must deal with the uncertainty of two hidden cards and unrestricted bet sizes. An AI that performs well at no-limit Texas Hold’em could also potentially tackle real-world problems with similar levels of uncertainty.

“The algorithms we used are not poker specific,” [CMU professor Tuomas] Sandholm explains. “They take as input the rules of the game and output strategy.”

... Libratus played the same overall strategy against all the players based on three main components:

First, the AI’s algorithms computed a strategy before the tournament by running for 15 million processor-core hours on a new supercomputer called Bridges.

Second, the AI would perform “end-game solving” during each hand to precisely calculate how much it could afford to risk in the third and fourth betting rounds (the “turn” and “river” rounds in poker parlance). Sandholm credits the end-game solver algorithms as contributing the most to the AI victory. The poker pros noticed Libratus taking longer to compute during these rounds and realized that the AI was especially dangerous in the final rounds, but their “bet big early” counter strategy was ineffective.

Third, Libratus ran background computations during each night of the tournament so that it could fix holes in its overall strategy. That meant Libratus was steadily improving its overall level of play and minimizing the ways that its human opponents could exploit its mistakes. It even prioritized fixes based on whether or not its human opponents had noticed and exploited those holes. By comparison, the human poker pros were able to consistently exploit strategic holes in the 2015 tournament against the predecessor AI called Claudico.
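
Sandholm's remark above that the algorithms "take as input the rules of the game and output strategy" fits the counterfactual regret minimization (CFR) family of equilibrium-finding methods widely used in this line of poker research. As a toy illustration of the idea (not Libratus's actual implementation, which used far more sophisticated abstraction and solving techniques on Bridges), here is a minimal vanilla CFR solver for Kuhn poker, a three-card simplification of poker:

```python
# Minimal vanilla counterfactual regret minimization (CFR) for Kuhn poker.
# Illustrative only; not Libratus's actual algorithm.
import random

PASS, BET = 0, 1
NUM_ACTIONS = 2

class Node:
    """Regret and strategy accumulators for one information set."""
    def __init__(self):
        self.regret_sum = [0.0] * NUM_ACTIONS
        self.strategy_sum = [0.0] * NUM_ACTIONS

    def current_strategy(self, reach_weight):
        # Regret matching: play actions in proportion to positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strategy = ([p / total for p in positive] if total > 0
                    else [1.0 / NUM_ACTIONS] * NUM_ACTIONS)
        for a in range(NUM_ACTIONS):
            self.strategy_sum[a] += reach_weight * strategy[a]
        return strategy

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return ([s / total for s in self.strategy_sum] if total > 0
                else [1.0 / NUM_ACTIONS] * NUM_ACTIONS)

nodes = {}  # information set -> Node

def cfr(cards, history, p0, p1):
    """Returns expected utility for the player to act; updates regrets."""
    player = len(history) % 2
    opponent = 1 - player

    # Terminal states: a fold, a check-check showdown, or a bet-call showdown.
    if len(history) > 1:
        higher = cards[player] > cards[opponent]
        if history[-1] == 'p':
            if history == 'pp':              # check-check showdown, win the ante
                return 1 if higher else -1
            return 1                         # opponent folded to a bet
        if history[-2:] == 'bb':             # bet-call showdown, win ante + bet
            return 2 if higher else -2

    info_set = str(cards[player]) + history
    node = nodes.setdefault(info_set, Node())
    strategy = node.current_strategy(p0 if player == 0 else p1)

    util = [0.0] * NUM_ACTIONS
    node_util = 0.0
    for a in range(NUM_ACTIONS):
        next_history = history + ('p' if a == PASS else 'b')
        if player == 0:
            util[a] = -cfr(cards, next_history, p0 * strategy[a], p1)
        else:
            util[a] = -cfr(cards, next_history, p0, p1 * strategy[a])
        node_util += strategy[a] * util[a]

    # Counterfactual regret: weight by the opponent's reach probability.
    for a in range(NUM_ACTIONS):
        node.regret_sum[a] += (p1 if player == 0 else p0) * (util[a] - node_util)
    return node_util

def train(iterations=100_000):
    cards = [1, 2, 3]
    for _ in range(iterations):
        random.shuffle(cards)                # chance-sampled deal
        cfr(cards, "", 1.0, 1.0)
    for info_set in sorted(nodes):
        probs = [round(p, 3) for p in nodes[info_set].average_strategy()]
        print(info_set, "pass/bet:", probs)

if __name__ == "__main__":
    train()
```

At convergence the average strategies approximate a Nash equilibrium of the toy game. The engineering challenge behind Libratus was carrying out this kind of equilibrium computation, together with the endgame re-solving and nightly self-improvement described above, at the vastly larger scale of no-limit hold'em.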

... The Libratus victory translates into an astounding winning rate of 14.7 big blinds per 100 hands in poker parlance—and that’s a very impressive winning rate indeed considering the AI was playing four human poker pros. Prior to the start of the tournament, online betting sites had been giving odds of 4:1 with Libratus seen as the underdog.
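
As a quick arithmetic check, the reported 14.7 bb/100 follows directly from the chip total and hand count, assuming a $100 big blind (the blind size is an assumption here, though it is consistent with the reported figures):

```python
# Quick arithmetic check of the reported 14.7 bb/100 win rate, assuming a
# $100 big blind (the match's stakes are an assumption here).
chips_won = 1_766_250
n_hands = 120_000
big_blind = 100

bb_per_100 = chips_won / big_blind / n_hands * 100
print(f"{bb_per_100:.1f} bb/100")   # -> 14.7
```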
Here's a recent paper on deep learning and poker. The program it describes, DeepStack, is not Libratus (thanks to a commenter for pointing this out), but both have managed to outperform human professionals.
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

https://arxiv.org/abs/1701.01724

Artificial intelligence has seen a number of breakthroughs in recent years, with games often serving as significant milestones. A common feature of games with these successes is that they involve information symmetry among the players, where all players have identical information. This property of perfect information, though, is far more common in games than in real-world problems. Poker is the quintessential game of imperfect information, and it has been a longstanding challenge problem in artificial intelligence. In this paper we introduce DeepStack, a new algorithm for imperfect information settings such as poker. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition about arbitrary poker situations that is automatically learned from self-play games using deep learning. In a study involving dozens of participants and 44,000 hands of poker, DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold'em. Furthermore, we show this approach dramatically reduces worst-case exploitability compared to the abstraction paradigm that has been favored for over a decade.
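
The "intuition" the abstract mentions is a learned value function: during re-solving, a deep network estimates the values of unexplored subtrees rather than searching them to the end. The sketch below shows only the rough interface such a network might expose (pot size plus both players' distributions over the 1,326 possible hold'em hands in, per-hand value estimates out); the layer sizes and exact input encoding are placeholders, not the paper's architecture.

```python
# Rough sketch of a "counterfactual value network" interface: pot size and
# both players' ranges in, per-hand value estimates out. Layer sizes and
# input encoding are placeholders, not DeepStack's actual architecture.
import torch
import torch.nn as nn

NUM_HANDS = 1326  # distinct two-card private holdings in hold'em (52 choose 2)

class CounterfactualValueNet(nn.Module):
    """Maps (pot size, both players' ranges) to per-hand value estimates."""
    def __init__(self, hidden=512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(1 + 2 * NUM_HANDS, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * NUM_HANDS),  # one value per hand, per player
        )

    def forward(self, pot, range_p1, range_p2):
        x = torch.cat([pot, range_p1, range_p2], dim=-1)
        return self.body(x)

# Shape demo with a random batch of 8 situations.
net = CounterfactualValueNet()
pot = torch.rand(8, 1)
range_p1 = torch.softmax(torch.rand(8, NUM_HANDS), dim=-1)
range_p2 = torch.softmax(torch.rand(8, NUM_HANDS), dim=-1)
values = net(pot, range_p1, range_p2)
print(values.shape)  # torch.Size([8, 2652])
```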
