In the development of artificial intelligence (AI), games have often served as benchmarks that drive remarkable breakthroughs in models and algorithms. No-limit Texas Hold'em (NLTH) is one of the most popular and challenging poker games. Despite numerous studies on this subject, several important problems remain open, notably opponent exploitation, i.e., adaptively and effectively exploiting a specific opponent's strategy, which is acknowledged as a vital issue in NLTH and in many real-world scenarios. Previous work applied off-policy reinforcement learning (RL) to train agents directly from historical strategy interactions, but suffered from sparse rewards. Other work instead adopted neuroevolution (NE) in place of RL for policy parameter updates, but suffered from high sample complexity due to the large scale of NLTH. In this work, we propose NE_RL, a novel method combining NE with RL for opponent exploitation in NLTH. Our method is a hybrid framework that leverages NE's population-based evolution with a long-term fitness metric to address the sparse reward feedback in NLTH, while retaining RL's gradient-based updates for higher learning efficiency. Experimental results against multiple baseline opponents demonstrate the effectiveness of our method, with significant improvements over previous methods. We hope this paper provides an effective new approach to opponent exploitation in NLTH and other large-scale imperfect-information games.
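To make the hybrid structure concrete, the sketch below illustrates one common way such an NE+RL loop can be organized: a population of policies is selected by a long-term, whole-episode fitness (robust to sparse rewards), while a separate gradient-trained policy is periodically injected into the population. Everything in the sketch, including the `ToyPokerEnv` stub, the linear policies, the finite-difference update standing in for the RL gradient, and all hyperparameters, is an assumption for illustration only, not the paper's implementation.

```python
# A minimal sketch of a hybrid NE+RL training loop in the spirit of NE_RL.
# All names, the toy environment, and hyperparameters are illustrative
# assumptions; they are not the paper's actual implementation.
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, ACT_DIM = 8, 3       # toy observation/action sizes (assumed)
POP_SIZE, N_ELITES = 10, 2    # evolutionary population settings (assumed)
EVAL_EPISODES = 5             # hands averaged per fitness evaluation (assumed)


class ToyPokerEnv:
    """Stand-in for an NLTH environment: the reward is sparse, arriving
    only when the hand ends, which is what hampers naive RL here."""

    def __init__(self, steps=10):
        self.steps = steps

    def reset(self):
        self.t = 0
        return rng.standard_normal(OBS_DIM)

    def step(self, action):
        self.t += 1
        done = self.t >= self.steps
        reward = float(action == 0) if done else 0.0  # sparse terminal reward
        return rng.standard_normal(OBS_DIM), reward, done


def rollout(theta, env):
    """Play one full hand with a linear policy; return its total reward."""
    obs, done, total = env.reset(), False, 0.0
    while not done:
        action = int(np.argmax(theta @ obs))
        obs, reward, done = env.step(action)
        total += reward
    return total


def fitness(theta, env):
    """Long-term fitness: average return over whole episodes, so selection
    pressure exists even though per-step rewards are almost always zero."""
    return float(np.mean([rollout(theta, env) for _ in range(EVAL_EPISODES)]))


def hybrid_ne_rl(generations=20):
    env = ToyPokerEnv()
    population = [0.1 * rng.standard_normal((ACT_DIM, OBS_DIM))
                  for _ in range(POP_SIZE)]
    rl_theta = 0.1 * rng.standard_normal((ACT_DIM, OBS_DIM))

    for _ in range(generations):
        # RL step: a finite-difference estimate stands in for a proper
        # gradient-based policy update, to keep this sketch dependency-free.
        base, grad = fitness(rl_theta, env), np.zeros_like(rl_theta)
        for _ in range(8):
            eps = rng.standard_normal(rl_theta.shape)
            grad += (fitness(rl_theta + 0.05 * eps, env) - base) * eps
        rl_theta += 0.01 * grad / 8

        # NE step: rank by long-term fitness, keep elites, mutate children.
        ranked = sorted(population, key=lambda th: fitness(th, env),
                        reverse=True)
        elites = ranked[:N_ELITES]
        children = []
        while len(elites) + len(children) < POP_SIZE - 1:
            parent = elites[rng.integers(N_ELITES)]
            children.append(parent + 0.05 * rng.standard_normal(parent.shape))

        # Synchronization: inject the gradient-trained policy into the
        # population so evolution can exploit RL's learning efficiency.
        population = elites + children + [rl_theta.copy()]

    return max(population, key=lambda th: fitness(th, env))


if __name__ == "__main__":
    best = hybrid_ne_rl()
    print("best fitness:", fitness(best, ToyPokerEnv()))
```

The division of labor in this sketch mirrors the abstract's description of the framework: selection by whole-episode fitness supplies a learning signal despite sparse per-step rewards, while the injected gradient-trained policy restores the sample efficiency that pure evolutionary search lacks at NLTH's scale.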