Algorithms for learning in simple and complex games

DeepStack bridges the gap between AI techniques for games of perfect information (like checkers, chess and Go) and those for games of imperfect information (like poker). It reasons while it plays, using “intuition” honed through deep learning to reassess its strategy with each decision.

DeepStack was introduced in the paper “DeepStack: Expert-level artificial intelligence in heads-up no-limit poker”, published in Science in March 2017. It was the first algorithm able to beat professional players of heads-up no-limit Texas hold'em poker.

Viliam Lisý (FEL, ČVUT), one of the co-authors of the paper, will give a series of four lectures on game theory, machine learning, and poker.

Outline:

The goal of this series of lectures is to get from the very basics of game theory and machine learning all the way to a solid understanding of the algorithm used in DeepStack. We will explain how games can be formally modeled, what meaningful definitions of optimal strategies in games look like, and what a Nash equilibrium is in particular. Afterwards, we will focus on simple learning mechanisms in repeated decision-making problems called multi-armed bandit problems. We will show basic properties of learning in these models and then investigate what happens when these algorithms are run against each other in a game. This will form the basis of an algorithm for computing Nash equilibria in simple zero-sum games, which can be extended to Counterfactual Regret Minimization (CFR) in extensive-form games. Next, we will explain why it is complicated to decompose an extensive-form game into independent parts and how CFR-D can solve this problem under certain conditions. Finally, we will briefly introduce deep neural networks and combine all the introduced components into the first algorithm that was able to beat professional poker players.
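To give a flavor of what "running learning algorithms against each other" can look like, here is a minimal sketch in Python. It assumes regret matching as the learning rule and rock-paper-scissors as the zero-sum game; neither choice is stated in the outline, and all names (PAYOFF, strategy_from_regrets, self_play) are made up for illustration. Each player keeps cumulative regrets per action and plays in proportion to the positive ones; in a zero-sum game the time-averaged strategies approach a Nash equilibrium, which for rock-paper-scissors is the uniform mix.

```python
# Payoff to the row player in rock-paper-scissors.
# The game is zero-sum, so the column player receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=20000):
    """Run two regret-matching learners against each other.

    Uses expected payoffs instead of sampled actions, so the run is
    deterministic. Returns the time-averaged strategy of each player.
    """
    n = len(PAYOFF)
    row_regrets = [1.0, 0.0, 0.0]  # arbitrary asymmetric start (assumption)
    col_regrets = [0.0] * n
    row_sum = [0.0] * n
    col_sum = [0.0] * n

    for _ in range(iterations):
        row = strategy_from_regrets(row_regrets)
        col = strategy_from_regrets(col_regrets)

        # Expected payoff of each pure action against the opponent's mix.
        row_values = [sum(PAYOFF[a][b] * col[b] for b in range(n))
                      for a in range(n)]
        col_values = [sum(-PAYOFF[a][b] * row[a] for a in range(n))
                      for b in range(n)]
        row_expected = sum(row[a] * row_values[a] for a in range(n))
        col_expected = sum(col[b] * col_values[b] for b in range(n))

        for a in range(n):
            # Regret: how much better action a would have done than the mix.
            row_regrets[a] += row_values[a] - row_expected
            col_regrets[a] += col_values[a] - col_expected
            row_sum[a] += row[a]
            col_sum[a] += col[a]

    row_avg = [s / iterations for s in row_sum]
    col_avg = [s / iterations for s in col_sum]
    return row_avg, col_avg


if __name__ == "__main__":
    row_avg, col_avg = self_play()
    print("row average strategy:", row_avg)
    print("column average strategy:", col_avg)
```

The strategies played in individual iterations keep cycling; it is the average over all iterations that settles near the equilibrium. CFR, as mentioned in the outline, builds on the same kind of regret update, applied at each decision point of an extensive-form game.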

When:

Monday, September 24, 2018: 13:30-15:00 and 16:00-17:30

Tuesday, September 25, 2018: 10:00-11:30 and 13:00-14:30

Where:

FJFI, ČVUT, Trojanova 13 - T212