2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00670

Aggregation Cross-Entropy for Sequence Recognition

Abstract: In this paper, we propose a novel method, aggregation cross-entropy (ACE), for sequence recognition from a brand new perspective. The ACE loss function exhibits competitive performance to CTC and the attention mechanism, with much quicker implementation (as it involves only four fundamental formulas), faster inference/back-propagation (approximately O(1) in parallel), less storage requirement (no parameter and negligible runtime memory), and convenient employment (by replacing CTC with ACE). Furthermore, the p…
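For orientation, the core of the ACE loss is short enough to state directly. The sketch below is a minimal NumPy reading of the four formulas the abstract alludes to: aggregate the per-class probabilities over the T time steps, normalize by T, and take the cross-entropy against the normalized character counts of the label. The function name, argument layout, and the small epsilon inside the log are our own assumptions, not the authors' reference implementation.

```python
import numpy as np

def ace_loss(probs, char_counts, blank_index=0):
    """Aggregation cross-entropy (ACE) loss, sketched from the paper.

    probs:       (T, K) per-time-step class probabilities, where K is the
                 character-set size plus one blank class.
    char_counts: (K,) occurrence count of each character in the label
                 (the blank entry is derived below).
    """
    T, K = probs.shape
    counts = char_counts.astype(np.float64).copy()
    # The blank class absorbs the unassigned time steps: N_blank = T - |label|.
    label_len = counts.sum() - counts[blank_index]
    counts[blank_index] = T - label_len
    a = probs.sum(axis=0)    # 1) aggregation over time: a_k = sum_t y_t^k
    y_bar = a / T            # 2) normalized prediction distribution
    n_bar = counts / T       #    normalized label distribution
    # 3)-4) cross-entropy between the two: L = -sum_k Nbar_k * ln(ybar_k)
    return -(n_bar * np.log(y_bar + 1e-12)).sum()
```

Because the label enters only through its character counts, the loss is order-free with respect to the annotation, which is what removes the alignment machinery that CTC and attention require.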

Cited by 109 publications (56 citation statements). References 45 publications.
“…Although the confusion among the 7360 classes is higher, Table IX shows an overall comparison of our proposed method and other state-of-the-art methods without/with a language model on the ICDAR 2013 competition set. We list the state-of-the-art oversegmentation methods heterogeneous CNN [7] and CNNs-RNNLM [8] and the segmentation-free methods SMDLSTM-CTC [15] and CNN-ACE [16] in Table IX for comparison. With the same configuration of vocabulary size (4 more garbage classes adopted in our HMM system), the proposed WCNN-PHMM yielded the best performance whether a language model was employed or not.…”
Section: Visualization Analysis for Writer Code
Mentioning confidence: 99%
“…In [15], the authors used a separable MDLSTM-RNN (SMDLSTM-RNN) with CTC loss instead of the traditional LSTM-CTC method. More recently, the authors of [16] proposed a novel aggregation cross-entropy loss for sequence recognition, which was shown to exhibit competitive performance for offline HCTR. In [17], we verified that combining a hybrid deep CNN-HMM (DCNN-HMM) with a powerful language model could achieve the best reported results among segmentation-free approaches on the ICDAR 2013 competition dataset.…”
Section: Introduction
Mentioning confidence: 99%
“…In practical use, the model prediction is generated by the RNN model. Following the recommendations of Xie et al. [47], the overall computational complexity of the proposed method is composed of four terms: O(1), O(|C^ε|), O(|C^ε|), and O(|C^ε|). The computational complexity of the loss function is therefore O(|C^ε|).…”
Section: Computational Complexity Analysis
Mentioning confidence: 99%
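To make the four terms concrete, the annotated fragment below maps each step of the loss computation to the complexity the excerpt quotes. It reuses the variable names from the earlier sketch, and the O(1) reading of the time aggregation assumes the sum over the T steps is fully parallelized, as the statement above implies.

```python
a     = probs.sum(axis=0)        # 1) aggregation over T steps: ~O(1)
                                 #    when parallelized across time
y_bar = a / T                    # 2) normalization:            O(|C^eps|)
log_y = np.log(y_bar + 1e-12)    # 3) logarithm:                O(|C^eps|)
loss  = -(n_bar * log_y).sum()   # 4) weighted sum:             O(|C^eps|)
```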
“…In recent years, deep learning has achieved tremendous success in fundamental computer vision applications such as image recognition [8,27,30], object detection [6,15,20,22] and image segmentation [7,17]. In light of this, deep learning has also been applied to areas such as text detection [4,34,46] and text recognition [11,13,35,37,40], as well as their downstream tasks such as key information extraction [5,29,39] and named entity recognition [3,36].…”
Section: Introduction
Mentioning confidence: 99%