Proceedings of the 2019 ACM Conference on Economics and Computation 2019
DOI: 10.1145/3328526.3329634
Iterated Deep Reinforcement Learning in Games

Abstract: Deep reinforcement learning (RL) is a powerful method for generating policies in complex environments, and recent breakthroughs in game-playing have leveraged deep RL as part of an iterative multiagent search process. We build on such developments and present an approach that learns progressively better mixed strategies in complex dynamic games of imperfect information, through iterated use of empirical game-theoretic analysis (EGTA) with deep RL policies. We apply the approach to a challenging cybersecurity g…

Cited by 15 publications (6 citation statements). References 24 publications.
“…The input image size is 480 × 480, and the maximum sequence length is 500. Synchronized BN [4] and the Ranger optimizer [5] are applied in this experiment, and the initial learning rate of the optimizer is 0.001 with step learning rate decay.…”
Section: Methods
Mentioning confidence: 99%
“…Ranger [16] is a synergistic optimizer combining RAdam (Rectified Adam) [22], LookAhead [23], and GC (gradient centralization) [24].…”
Section: Ablation Study
Mentioning confidence: 99%
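Gradient centralization, one of Ranger's three components, simply subtracts the per-output-channel mean from the gradient before the update step. A minimal plain-Python sketch of the idea (the function name `centralize_gradient` is illustrative, not from the cited implementation):

```python
def centralize_gradient(grad):
    """Gradient centralization: subtract the mean over all dimensions
    except the first (output-channel) dimension from each gradient row."""
    centralized = []
    for row in grad:  # grad: list of rows, one per output channel
        mean = sum(row) / len(row)
        centralized.append([g - mean for g in row])
    return centralized

# Each row of the centralized gradient has zero mean.
grad = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
print(centralize_gradient(grad))  # → [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]
```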
“…In addition, we leveraged the feature pyramid network (FPN) [11], a large input resolution, and the deformable convolutional network (DCN) [12] because these techniques could benefit large formula detection, small formula detection, or both. Finally, some other tricks, such as ResNeSt [13], SyncBN [14], a large batch size, weighted box fusion (WBF) [15], and the Ranger [16] optimizer, were adopted in our solution.…”
Section: Introduction
Mentioning confidence: 99%
“…Ranger Optimizer. Ranger [3] integrates RAdam (Rectified Adam) [4], LookAhead [5], and GC (gradient centralization) [6] into one optimizer. LookAhead can be considered as an extension of Stochastic Weight Averaging (SWA) [7] in the training stage.…”
Section: Task 1: Table Structure Reconstruction
Mentioning confidence: 99%
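The SWA-like averaging mentioned above is LookAhead's defining step: an inner optimizer takes k fast steps, then the slow weights are interpolated a fraction alpha toward the fast weights. A minimal scalar sketch (the parameter names and toy inner optimizer are illustrative assumptions, not the reference implementation):

```python
def lookahead(fast_step, w0, alpha=0.5, k=5, outer_steps=3):
    """LookAhead: run k inner-optimizer steps on the fast weight,
    then move the slow weight a fraction alpha toward it."""
    slow = fast = w0
    for _ in range(outer_steps):
        for _ in range(k):
            fast = fast_step(fast)            # inner optimizer update
        slow = slow + alpha * (fast - slow)   # slow-weight interpolation
        fast = slow                           # reset fast weights to slow
    return slow

# Toy inner optimizer: gradient descent on f(w) = w**2 (gradient 2w).
step = lambda w: w - 0.1 * 2 * w
print(lookahead(step, w0=1.0))  # moves toward the minimum at 0
```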
“…The maximum sequence length is 500. By default, Synchronized BN [10] and the Ranger optimizer [3] are used in our experiments. The initial learning rate of the optimizer is 0.001 with step learning rate decay.…”
Section: Implementation Details
Mentioning confidence: 99%