2017
DOI: 10.48550/arxiv.1709.03272
Preprint

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

Cited by 18 publications (23 citation statements)
References 31 publications
“…To the best of our knowledge, this is the best reported result in literature. Similar to [He et al, 2016a], we utilized an ensemble of 5 networks, while the backbones are ResNet101 (2 networks), ResNet50 (2 networks) and VGG (1 network). We used an ensemble of these 5 networks for proposing regions.…”
Section: Results (mentioning)
confidence: 99%
“…In Figure 1, the basic feature extraction module is ResNet-50 [He et al, 2016a]. For scene text detection, finer feature information is very important especially for segmentation task, the final downsampling in res stage 5 may lose some useful information.…”
Section: Overview (mentioning)
confidence: 99%
“…Usually, for improving the robustness of the model when encountering various orientations of texts, existing methods [20,25,26] make use of data augmentation that rotates source images to different angles to harvest sufficient data for training. Despite the effectiveness of data augmentation, the main drawback lies in learning all the possible transformations of augmented data require more network parameters, and it also may result in significant increase of training cost and over-fitting risk [27].…”
Section: Dual-RoI Pooling (mentioning)
confidence: 99%
“…There are also many methods adopt box proposal based instance segmentation network such as Mask R-CNN [12] for text detection. [4] fuses multiple-layer feature map for RPN and RoI, finally predicts a segmentation map of text. [42] completes both of detection and recognition tasks on mask-branch by predicting word segmentation map and character instance segmentation map, a weighted edit distance measure is proposed to find the best-matched recognition result.…”
Section: Related Work (mentioning)
confidence: 99%