Overlay text extraction from TV news broadcast

Kannao, Raghvendra; Guha, Prithwijit

doi:10.1109/indicon.2015.7443440

Cited by 3 publications

(4 citation statements)

References 8 publications

“…Inspired by this approach, we also generate part of our training data synthetically, however, we use the resulting dataset to improve the performance of several of our system's modules and not a Tesseract engine. Furthermore, contrary to the results presented in [7], our text recognition engine that relies on a convolutional recurrent neural network architecture [1] significantly outperforms the competing methods, including the baseline Tesseract method.…”

Section: Related Work (contrasting)

confidence: 85%

“…Detecting and recognizing blocks of text in videos has also gained significant attention from the research community [5,6,7]. In [5], Sato et al. present an approach based on extracting and classifying hand-crafted features using a computer vision method.…”

Section: Related Work (mentioning)

confidence: 99%

“…The problem of video overlay extraction is also tackled in [7], where Kannao and Guha propose to detect entire lines of text instead of single words. To decrease the computational cost of detection and recognition, they use temporal tracking across multiple frames.…”

Section: Related Work (mentioning)

confidence: 99%

Extracting Textual Overlays from Social Media Videos Using Neural Networks

Słucki, Trzciński, Bielski et al. 2018

Computer Vision and Graphics

Textual overlays are often used in social media videos, as people who watch them without sound would otherwise miss essential information conveyed in the audio stream. This is why extraction of those overlays can serve as an important metadata source, e.g. for content classification or retrieval tasks. In this work, we present a robust method for extracting textual overlays from videos that builds on multiple neural network architectures. The proposed solution relies on several processing steps: keyframe extraction, text detection and text recognition. The main component of our system, i.e. the text recognition module, is inspired by a convolutional recurrent neural network architecture, and we improve its performance using a synthetically generated dataset of over 600,000 images with text, prepared by the authors specifically for this task. We also develop a filtering method that reduces the number of overlapping text phrases using Levenshtein distance and further boosts the system's performance. The final accuracy of our solution reaches over 80% and is on par with state-of-the-art methods.
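The abstract above mentions filtering overlapping text phrases with Levenshtein distance but gives no details. The following is only a minimal sketch of how such a filter might look, not the authors' method: the function names, the normalisation by the longer phrase's length, and the `max_ratio` threshold of 0.2 are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def filter_near_duplicates(phrases, max_ratio=0.2):
    """Keep the first occurrence of each phrase and drop later near-duplicates.

    Hypothetical rule: two phrases count as duplicates when their
    case-insensitive edit distance, divided by the length of the longer
    phrase, is at most max_ratio. The threshold is an assumed value.
    """
    kept = []
    for phrase in phrases:
        is_duplicate = False
        for other in kept:
            longest = max(len(phrase), len(other), 1)
            distance = levenshtein(phrase.lower(), other.lower())
            if distance / longest <= max_ratio:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(phrase)
    return kept
```

For instance, phrases recognised in consecutive keyframes often differ by a single misread character; under the assumed threshold, `filter_near_duplicates(["Breaking news today", "Breaking news todav", "Weather update"])` would keep only the first and third phrases.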

“…The recent trends for parallel computing, advances in real-time image processing, machine learning and artificial intelligence, embedded and hardware solutions, make possible the design of systems for real-time video analysis. This could offer a wide range of applications for TV industry and users as the video-based soccer analysis [1] or the advertising, logo and text detection [2][3][4].…”

Section: Introduction (mentioning)

confidence: 99%