Proceedings of the 2019 ACM Conference on Economics and Computation 2019
DOI: 10.1145/3328526.3329634
Iterated Deep Reinforcement Learning in Games

Abstract: Deep reinforcement learning (RL) is a powerful method for generating policies in complex environments, and recent breakthroughs in game-playing have leveraged deep RL as part of an iterative multiagent search process. We build on such developments and present an approach that learns progressively better mixed strategies in complex dynamic games of imperfect information, through iterated use of empirical game-theoretic analysis (EGTA) with deep RL policies. We apply the approach to a challenging cybersecurity g…

Cited by 15 publications (6 citation statements). References 24 publications.
“…The input image size is 480 × 480, and the maximum sequence length is 500. Synchronized BN [4] and the Ranger optimizer [5] are applied in this experiment, and the initial learning rate of the optimizer is 0.001 with step learning rate decay.…”
Section: Methods
Mentioning confidence: 99%
“…Ranger [16] is a synergistic optimizer combining RAdam (Rectified Adam) [22], LookAhead [23], and GC (gradient centralization) [24].…”
Section: Ablation Study
Mentioning confidence: 99%
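Gradient centralization, one of Ranger's three components, simply subtracts the per-output-channel mean from the gradient before the update step. A minimal plain-Python sketch of the idea (the function name `centralize_gradient` is illustrative, not from the cited implementation):

```python
def centralize_gradient(grad):
    """Gradient centralization: subtract the mean over all dimensions
    except the first (output-channel) dimension from each gradient row."""
    centralized = []
    for row in grad:  # grad: list of rows, one per output channel
        mean = sum(row) / len(row)
        centralized.append([g - mean for g in row])
    return centralized

# Each row of the centralized gradient has zero mean.
grad = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
print(centralize_gradient(grad))  # → [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]
```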
“…In addition, we leveraged the feature pyramid network (FPN) [11], a large input resolution, and the deformable convolutional network (DCN) [12] because these techniques could benefit large formula detection, small formula detection, or both. Finally, some other tricks, such as ResNeSt [13], SyncBN [14], a large batch size, weighted box fusion (WBF) [15], and the Ranger [16] optimizer, were adopted in our solution.…”
Section: Introduction
Mentioning confidence: 99%
“…Ranger Optimizer. Ranger [3] integrates RAdam (Rectified Adam) [4], LookAhead [5], and GC (gradient centralization) [6] into one optimizer. LookAhead can be considered as an extension of Stochastic Weight Averaging (SWA) [7] in the training stage.…”
Section: Task 1: Table Structure Reconstruction
Mentioning confidence: 99%
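The SWA-like averaging mentioned above is LookAhead's defining step: an inner optimizer takes k fast steps, then the slow weights are interpolated a fraction alpha toward the fast weights. A minimal scalar sketch (the parameter names and toy inner optimizer are illustrative assumptions, not the reference implementation):

```python
def lookahead(fast_step, w0, alpha=0.5, k=5, outer_steps=3):
    """LookAhead: run k inner-optimizer steps on the fast weight,
    then move the slow weight a fraction alpha toward it."""
    slow = fast = w0
    for _ in range(outer_steps):
        for _ in range(k):
            fast = fast_step(fast)            # inner optimizer update
        slow = slow + alpha * (fast - slow)   # slow-weight interpolation
        fast = slow                           # reset fast weights to slow
    return slow

# Toy inner optimizer: gradient descent on f(w) = w**2 (gradient 2w).
step = lambda w: w - 0.1 * 2 * w
print(lookahead(step, w0=1.0))  # moves toward the minimum at 0
```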
“…The maximum sequence length is 500. By default, Synchronized BN [10] and the Ranger optimizer [3] are used in our experiments. The initial learning rate of the optimizer is 0.001 with step learning rate decay.…”
Section: Implementation Details
Mentioning confidence: 99%