2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.00869

TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

citations
Cited by 100 publications
(53 citation statements)
references
References 35 publications
“…Although there are many data sets for training and evaluating OCR systems on printed and scene text (see Singh et al for a recent survey [42]), we found that existing collections of handwritten word images to be limited in their variability (See Table 1). To facilitate training and testing of our TSB, we, therefore, collected and annotated ∼135K handwritten English words from 5K images originally hosted publicly on Imgur.com.…”
Section: Imgur5k Handwriting Set (mentioning)
confidence: 89%
“…The experimental results show that the proposed method has improvement on accuracy compared to previous text detectors, and achieves state-of-the-art performance on three benchmark datasets. In the future, we are interested in detecting cases "text inside text" via RCNN module and developing an end-to-end [16] scene text spotting system.…”
Section: Discussion (mentioning)
confidence: 99%
“…Having in mind the idea that the text recognition head might be undertrained due to complex training pipeline, we freeze all layers except layers that are related to the text recognition head and fine-tune the model during 130K iterations. This fine-tuning does not improve results on ICDAR 2015, but improves quality results upto 74.5% End-to-end recognition metric on Total-Text dataset, making quality metrics values on par with Mask TextSpotter v3 fine-tuned on TextOCR dataset (Singh et al (2021)), see Table 5 for details.…”
Section: Methods (mentioning)
confidence: 99%
“…TextOCR (Singh et al (2021)) is a recently published arbitrary-shaped scene text detection and recognition dataset consisting of 28,134 images with 900K annotated words.…”
Section: Scene Text Datasets (mentioning)
confidence: 99%