2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.00730
Dictionary-guided Scene Text Recognition

Cited by 53 publications (21 citation statements) · References 17 publications
“…In this section, we experimentally validate our proposed TANGER by comparing its performance with state-of-the-art methods on several public datasets as well as one newly collected multilingual dataset, TsiText. First, we examine the performance of TANGER for multilingual scene text recognition in comparison with two end-to-end methods [10, 22] and one dictionary-guided method [34]. Then, we compare our model with the vision transformer ViTSTR [5] in three variants, i.e., the tiny, small, and base versions, for monolingual scene text recognition.…”
Section: Results
confidence: 99%
“…Table I lists the performance of the proposed TANGER with three state-of-the-art algorithms for multilingual scene text recognition, namely ABCNet+D [34], E2E-MLT [10], and Multiplexed [22]. ABCNet+D [34] is a dictionary-based recognition algorithm that incorporates dictionaries before handling ambiguous cases.…”
Section: B. Performance on Multilingual Scene Texts
confidence: 99%
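The excerpt above describes ABCNet+D as a dictionary-based recognizer that uses a lexicon to resolve ambiguous predictions. As a rough illustration of the idea (not the actual ABCNet+D method, which integrates the dictionary into the recognition pipeline itself), a recognizer's raw output can be snapped to the nearest lexicon entry by edit distance:

```python
# Illustrative sketch of dictionary-guided correction: map a raw
# recognizer prediction to the closest word in a lexicon using
# Levenshtein (edit) distance. Function names and the lexicon here
# are hypothetical, chosen only for the example.

def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def dictionary_correct(raw: str, lexicon: list[str]) -> str:
    """Return the lexicon entry closest to the raw prediction."""
    return min(lexicon, key=lambda w: edit_distance(raw, w))
```

For example, `dictionary_correct("c0ffee", ["coffee", "office", "toffee"])` returns `"coffee"`, since only one substitution separates the two strings.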
“…Liu et al [48] perform text recognition using feature pyramids. We demonstrate that, without any task-specific engineering, we reconstruct fine details to perform robustly in dark, noisy conditions on SOTA text recognition methods [20,19,26,4,58], such as PARSeq [7].…”
Section: Related Work
confidence: 99%
“…As a result, they continuously refine their visual module to obtain increasingly robust visual features and boost their recognition accuracy. Language-based methods [15-18] require an additional module to learn linguistic information that can assist with text prediction. This leads to better performance than working without a language module.…”
Section: Introduction
confidence: 99%