2019
DOI: 10.1016/j.patcog.2018.07.034

Script identification in natural scene image and video frames using an attention based Convolutional-LSTM network

Cited by 123 publications (45 citation statements)
References 25 publications
“… In human visual system, attention is one of the important mechanisms in capturing information from images. Attention mechanism operates in such a way that it not only extracts the essential information from image, but also stores its contextual relation with other components of image [243]. In future, research may be carried out in the direction that preserves the spatial relevance of objects along with their discriminating features at later stages of learning.…”
Section: Future Directions (mentioning)
confidence: 99%
“…The image in Figure 4(a) which has much more Chinese patches than kana was misclassified to Chinese by their model. Bhunia [20] coupled local and global features, but it suffers from the impairment too. What is more, the use of many cropped patches can make considerably redundant computation and memory usage which can influence the efficiency especially in its LSTM module which precludes parallelization.…”
Section: B Results (mentioning)
confidence: 99%
“…As for the problem of arbitrary aspect ratios, recent methods with good performance take densely cropped image patches with fixed size as input [12], [13], [15], [20]. They also employ data augmentation somehow, but they suffered from the following three issues.…”
Section: Introduction (mentioning)
confidence: 99%
“…They have also used Discrete Wavelet Transform (DWT) to reduce the dimension of the data. In [23], the authors have used a CNN-Long Short-Term Memory (LSTM) based framework with dynamic weighting for script recognition. From each image, patches are extracted which are fed to the CNN-LSTM combination.…”
Section: Related Study (mentioning)
confidence: 99%
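
Note: the statements above describe cropping fixed-size patches from an image of arbitrary aspect ratio and combining per-patch predictions with dynamic weights. The sketch below is only an illustration of that general idea, not the cited paper's implementation. The function names (`patch_offsets`, `weighted_fusion`), the stride parameter, and the use of softmax to produce the weights are all assumptions made for this example; the CNN and LSTM that would produce the per-patch scores are omitted.

```python
import math


def patch_offsets(width, patch, stride):
    """Left offsets of fixed-width patches covering a text line of `width` pixels.

    Hypothetical helper: the cited work may crop patches differently.
    """
    if width <= patch:
        return [0]  # image narrower than one patch: a single (padded) patch
    offsets = list(range(0, width - patch + 1, stride))
    # Add a final patch flush with the right edge so no columns are dropped.
    if offsets[-1] != width - patch:
        offsets.append(width - patch)
    return offsets


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(patch_scores, patch_logits):
    """Fuse per-patch class scores with one weight per patch.

    `patch_scores[i]` holds the class scores of patch i, and
    `patch_logits[i]` is an unnormalized importance score for patch i.
    Softmax weighting is an assumption of this sketch.
    """
    weights = softmax(patch_logits)
    n_classes = len(patch_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, patch_scores))
        for c in range(n_classes)
    ]
```

For example, a 100-pixel-wide line with 32-pixel patches and a stride of 32 yields patches at offsets 0, 32, 64 and 68, and two patches with equal importance contribute equally to the fused scores.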