Reading scene text with fully convolutional sequence modeling
2019 · DOI: 10.1016/j.neucom.2019.01.094

Cited by 79 publications (51 citation statements) · References 28 publications
“…Among them, the first group contains several well-known recognition networks, including CRNN [1] and GRCNN [5]. We then compare ours with previous attention-aware approaches such as FAN [8], FCN [14], and Baek et al. [16].…”
Section: Configuration
Mentioning confidence: 99%
“…Lee et al. [17] combined a recursive CNN with a recurrent CNN in their R²AM to capture long-term dependencies when extracting features from raw images, and then fed these features to an attention-RNN network for sequential transcription. Gao et al. [4, 12] designed two models to compare the performance of CNN and LSTM for sequential feature encoding. In their experiments, features extracted by the LSTM were more powerful than those extracted by the CNN.…”
Section: Related Work
Mentioning confidence: 99%
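The CNN-versus-LSTM comparison quoted above is easy to make concrete. The following minimal PyTorch sketch contrasts the two ways of encoding the per-column features produced by a convolutional backbone: a stack of 1-D convolutions versus a bidirectional LSTM. All shapes and layer sizes here are illustrative assumptions, not the configurations used by Gao et al. [4, 12].

```python
# Hedged sketch: two alternative sequence encoders over per-column CNN
# features. Shapes and layer sizes are illustrative assumptions only.
import torch
import torch.nn as nn

B, C, W = 4, 512, 25          # batch, channels, sequence length (image width)
cols = torch.randn(B, C, W)   # per-column features from a conv backbone

# (a) Convolutional sequence encoder: stacked 1-D convs along the width axis.
conv_encoder = nn.Sequential(
    nn.Conv1d(C, C, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv1d(C, C, kernel_size=3, padding=1), nn.ReLU(),
)
conv_out = conv_encoder(cols)                    # (B, C, W)

# (b) Recurrent sequence encoder: a bidirectional LSTM over the same columns.
rnn_encoder = nn.LSTM(C, C // 2, bidirectional=True, batch_first=True)
rnn_out, _ = rnn_encoder(cols.transpose(1, 2))   # (B, W, C)
```

The trade-off the citing papers discuss follows directly from this structure: the convolutional encoder sees a fixed receptive field and parallelizes over all columns, while the LSTM propagates state sequentially and can, in principle, capture longer-range dependencies.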
“…However, with the development of LSTM-based recognizers, the DCNN ones were quickly and significantly surpassed. Recently, some researchers argued that LSTM-based models were hard to train [12] and unable to achieve good performance on non-horizontal text [10], so explorations of models without LSTM started again. For instance, STN-OCR [9] utilized fully connected layers and a fixed number of softmax classifiers for sequential prediction; SqueezedText [20] employed a binary convolutional encoder-decoder network to generate salience maps for individual characters and then exploited a GRU-based bi-RNN for further correction; Liao et al. [10] proposed to address scene text recognition from a 2-D perspective with a CA-FCN model, so that spatial information could be taken into account when performing prediction.…”
Section: Related Work
Mentioning confidence: 99%
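The LSTM-free direction these citing papers describe, and which the reviewed paper's title names, can be sketched end to end: a convolutional backbone collapses the image height, stacked 1-D convolutions take over the sequence-modeling role usually given to a BiLSTM, and a 1×1 convolution emits per-column class logits suitable for CTC decoding. This is a minimal sketch of the general idea under assumed layer sizes and a 37-class alphabet, not the authors' exact architecture.

```python
# Hedged sketch of an LSTM-free, fully convolutional scene text recognizer.
# The backbone, channel counts, and 37-class alphabet are assumptions.
import torch
import torch.nn as nn

class FullyConvRecognizer(nn.Module):
    def __init__(self, num_classes=37):   # 26 letters + 10 digits + CTC blank
        super().__init__()
        self.backbone = nn.Sequential(     # toy backbone for 32-px-high input
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),   # collapse height to 1
        )
        self.seq = nn.Sequential(          # sequence modeling without RNNs
            nn.Conv1d(256, 256, 3, padding=1), nn.ReLU(),
            nn.Conv1d(256, 256, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv1d(256, num_classes, 1)  # per-column classifier

    def forward(self, x):                  # x: (B, 1, 32, W)
        f = self.backbone(x).squeeze(2)    # (B, 256, W/4)
        return self.head(self.seq(f))      # (B, num_classes, W/4), for CTC

logits = FullyConvRecognizer()(torch.randn(2, 1, 32, 100))
print(logits.shape)  # torch.Size([2, 37, 25])
```

Because every layer is convolutional, training parallelizes over the whole sequence and avoids the optimization difficulties the citing papers attribute to LSTM-based recognizers [12].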