1999
DOI: 10.1007/s005300050140

Video OCR: indexing digital news libraries by recognition of superimposed captions

Abstract: The automatic extraction and recognition of news captions and annotations can be of great help in locating topics of interest in digital news video libraries. To achieve this goal, we present a technique, called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we address problems, describe the method by which Video OCR operates, and suggest applications for its use in digital news archives. To solve two problems of character recognition for …

Cited by 149 publications (91 citation statements)
References 16 publications
“…The text detection methods based on this characteristic assume that text regions have uniform colors and satisfy certain constraints on size, shape, and spatial layout. The second is the texture-like characteristic of the text regions [65,33,66,67]. The text detection methods based on texture information usually assume that the text regions have special texture patterns.…”
Section: Related Work On Video Text Detection and Tracking
confidence: 99%
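The color/layout heuristic described above can be sketched as connected-component filtering: group foreground pixels into components, then keep only those whose size and aspect ratio look text-like. This is a minimal illustration; the `min_area` and `max_aspect` thresholds are illustrative assumptions, not values from the cited papers.

```python
def connected_components(mask):
    """4-connected components of a 2D boolean grid, as lists of pixel sets."""
    seen, comps = set(), []
    rows, cols = len(mask), len(mask[0])
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                stack, comp = [(r, c)], set()
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    comp.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                comps.append(comp)
    return comps

def text_like(comp, min_area=4, max_aspect=10.0):
    """Keep components whose area and width/height ratio fit text constraints."""
    ys = [y for y, _ in comp]
    xs = [x for _, x in comp]
    h = max(ys) - min(ys) + 1
    w = max(xs) - min(xs) + 1
    return len(comp) >= min_area and w / h <= max_aspect

# Demo: a horizontal run of pixels (caption-like) plus one noise pixel.
mask = [[False] * 12 for _ in range(6)]
for x in range(2, 10):
    mask[2][x] = True      # 8-pixel horizontal run: text-like
mask[5][0] = True          # isolated pixel: filtered out by min_area
comps = connected_components(mask)
print(len(comps), sum(text_like(c) for c in comps))  # → 2 1
```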
“…Additional locations are mapped from "organization" terms/phrases with self-contained locations, such as "Capitol Hill", using a manually created mapping list. Note that location terms are sometimes superimposed on the video frames, which can be recognized by video optical character recognition (VOCR) techniques [7]. However, the VOCR output tends to be errorful on low-resolution news video, and it offers few distinct locations since most of them overlap with those from the transcript.…”
Section: Extracting Candidate Locations
confidence: 99%
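The manually created mapping list described above amounts to a simple lookup table from organization terms to their implied locations. A minimal sketch, where the table entries beyond "Capitol Hill" are illustrative assumptions:

```python
# Hypothetical mapping list; "Capitol Hill" comes from the cited paper,
# the other entries are illustrative examples.
ORG_TO_LOCATION = {
    "capitol hill": "Washington, D.C.",
    "the pentagon": "Arlington, Virginia",
    "wall street": "New York City",
}

def map_locations(terms):
    """Return mapped locations for any terms found in the mapping list."""
    return [ORG_TO_LOCATION[t.lower()] for t in terms
            if t.lower() in ORG_TO_LOCATION]

print(map_locations(["Capitol Hill", "NASDAQ"]))  # → ['Washington, D.C.']
```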
“…While we choose not to rely on the errorful locations recognized by VOCR [7] (Section 2), they are nevertheless useful due to their similarity to the true location terms. In Fig. 4, for example, Iraq is recognized as Lraq, differing by only one character.…”
Section: Screen-overlaid Location (VOCR)
confidence: 99%
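The "Lraq"/"Iraq" observation above suggests matching errorful VOCR tokens against a known location list by edit distance. A minimal sketch; the gazetteer and the one-edit threshold are illustrative assumptions, not the cited paper's method.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def match_location(token, gazetteer, max_dist=1):
    """Return the closest gazetteer entry within max_dist edits, else None."""
    best = min(gazetteer, key=lambda loc: edit_distance(token.lower(), loc.lower()))
    return best if edit_distance(token.lower(), best.lower()) <= max_dist else None

print(match_location("Lraq", ["Iraq", "Iran", "Israel"]))  # → Iraq
```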
“…(18). This verification scheme removed 7255 of the 7537 false alarm regions while rejecting only 23 true text lines, which gives a 99.76% RRR and a 97% RPR, as listed in Table 4.…”
Section: Evaluation Of the Text Verification
confidence: 99%
“…Temporal information is usually helpful for detecting captions, since they are mostly stationary, but it is not very helpful for scene text, which may undergo various motions and transformations. Sato [18] and Lienhart [12] computed the maximum or minimum value at each pixel position over frames. The values of background pixels, which are assumed to vary more through the video sequence, are pushed toward black or white, while the values of the text pixels are kept.…”
Section: Introduction
confidence: 99%
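The temporal min/max idea attributed above to Sato and Lienhart can be sketched in a few lines: stationary caption pixels keep similar values across frames, so a per-pixel minimum (or maximum) over a frame stack suppresses the changing background. The frame shapes and values below are synthetic assumptions for illustration only.

```python
import numpy as np

def temporal_min_max(frames):
    """frames: (T, H, W) grayscale stack. Returns (min_img, max_img).

    Bright text on a changing background survives in the min image;
    dark text survives in the max image.
    """
    return frames.min(axis=0), frames.max(axis=0)

# Synthetic demo: one stationary bright "caption" pixel over random background.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(30, 4, 4)).astype(np.uint8)
frames[:, 1, 1] = 250                  # stationary bright caption pixel
min_img, max_img = temporal_min_max(frames)
print(min_img[1, 1])                   # → 250 (caption pixel is preserved)
```

A background pixel's minimum over 30 frames drops toward black, while the caption pixel stays at its original brightness, which is exactly the separation the cited methods exploit.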