Proceedings of the 24th ACM International Conference on Multimedia 2016
DOI: 10.1145/2964284.2967282

Detecting Arbitrary Oriented Text in the Wild with a Visual Attention Model

Abstract: Text embedded in images provides important semantic information about a scene and its content. Detecting text in an unconstrained environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. We present a novel attention model for detecting arbitrarily oriented and curved scene text. Inspired by the attention mechanisms in the human visual system, our model utilizes a spatial glimpse network to process the attended area and deploys a recurrent neural network…

Cited by 13 publications (5 citation statements)
References 25 publications

“…As with other DL models, however, this work does not assume the forcings or the target observations to be perfect. The Artificial Intelligence community has worked extensively with data “in the wild,” that is, large but low‐quality data sets, and DL models appear to deliver good performance even if there is significant noise (Huang et al., 2016; Izadinia et al., 2015; Stadelmann et al., 2018). What will mislead models are systematic errors, which is what this methodology proposes to improve.…”
Section: Results (mentioning)
confidence: 99%
“…Huang et al [26] utilized the recurrent attention model to detect arbitrary oriented text in the wild and achieved state-of-the-art accuracy on ICDAR 2013 and MSRA-TD500. However, their detection pipeline depends on extremal regions to generate initial attention proposals and on CNN classifiers to filter non-text proposals.…”
Section: Visual Attention Model (mentioning)
confidence: 99%
“…Volunteer scientists can also be requested for results in an active learning framework (Settles, 2012), i.e., they can be queried for more data for instances that can best reduce the uncertainty of the predictions. Crowd-sourced data have played roles in deep learning research (Huang et al, 2016; Izadinia et al, 2015), even though there are problems related to data quality. An important co-benefits of involving citizen scientists is the education and outreach to the public.…”
Section: Collecting Big Data Through Data Sharing and Citizen Scientists (mentioning)
confidence: 99%