2017 IEEE International Conference on Computer Vision (ICCV) 2017
DOI: 10.1109/iccv.2017.242

Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework

Cited by 238 publications (199 citation statements)
References 11 publications
“…Those approaches can be divided into two major groups: two-stage proposal-driven and one-stage proposal-free method. Although two-stage framework [37,24,10] consistently achieves top accuracy on the public benchmarks [12,11,25], recent works [16,22,2,9] based on one-stage frameworks also demonstrate yielding faster text detectors with comparable accuracy.…”
Section: Introduction (mentioning)
confidence: 94%
“…In order to achieve multi-oriented text detection, DMPNet [22] added several rotated anchors, for a total of 12 (6 regular and 6 inclined) to find the best match to arbitrary-oriented text instance. Instead of choosing priors by hand, DeepTextSpotter [2] followed YOLOv2 [28] runs k-means clustering (k = 14) on the training set bounding boxes to automatically find suitable priors.…”
Section: Introduction (mentioning)
confidence: 99%
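
The prior-selection step described in the statement above can be sketched roughly as follows. This is a minimal illustration only, assuming YOLOv2-style k-means over (width, height) pairs with the distance 1 - IoU; the function names, the toy boxes, and the use of k = 2 instead of k = 14 are hypothetical choices for the example and are not taken from the Deep TextSpotter code.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_priors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs; the closest centroid is the one with highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # initialise from k distinct training boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Minimising 1 - IoU is the same as maximising IoU.
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return sorted(centroids)


# Hypothetical toy data: three small square boxes and three wide text-like boxes.
toy_boxes = [(10, 10), (12, 12), (11, 11), (100, 20), (110, 22), (105, 21)]
priors = kmeans_priors(toy_boxes, k=2)
print(priors)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the motivation given for this choice in YOLOv2.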
“…Perspective RoI Transform: Given quadrangle text proposals predicted by the detection branch, we employ the perspective RoI transform [37] to align the corresponding regions from the feature map F into small feature maps F_p rather than wrapping proposals in rotated rectangles [8][28][14]. Each feature map F_p is kept in a fixed height with an unchanged aspect ratio.…”
Section: End-to-end Chinese Text Reading (mentioning)
confidence: 99%
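
The "fixed height with an unchanged aspect ratio" constraint in the statement above can be illustrated with a small size calculation. This is a sketch under stated assumptions, not the citing paper's implementation: the function name `roi_output_size`, the corner ordering, the averaging of opposite edges, and the default height of 8 are all hypothetical, and the perspective warp itself is not shown.

```python
import math


def roi_output_size(quad, out_height=8):
    """Return (out_width, out_height) for a quadrangle proposal.

    quad holds four (x, y) corners in the assumed order:
    top-left, top-right, bottom-right, bottom-left.
    """
    tl, tr, br, bl = quad
    width = (math.dist(tl, tr) + math.dist(bl, br)) / 2   # mean of top and bottom edges
    height = (math.dist(tl, bl) + math.dist(tr, br)) / 2  # mean of left and right edges
    # Scale the width by the same factor as the height so the aspect ratio is preserved.
    out_width = max(1, round(out_height * width / height))
    return out_width, out_height


# Hypothetical proposal: an axis-aligned 40 x 10 text region.
size = roi_output_size([(0, 0), (40, 0), (40, 10), (0, 10)])
print(size)
```

Letting the width vary with the proposal is what allows a sequence recognizer to read words of different lengths without distorting the characters.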
“…Liao et al [14], [15] use a single-shot text detector along with a text recognizer to detect and recognize horizontal and oriented scene text in images respectively. [16], [17] start to integrate the scene text detection and recognition modules into a unified framework. [5] propose an end-to-end trainable scene text spotter enjoying a simple pipeline and can handle text of various shapes.…”
Section: B Differences From Scene Text Spotting (mentioning)
confidence: 99%