A research on Video text tracking and recognition

Wang, Baokang; Liu, Changsong; Ding, Xiaoqing

doi:10.1117/12.2009441

Cited by 7 publications

(2 citation statements)

References 0 publications

“…These tactics are referred to as tracking-based detection strategies. Generalized temporal-spatial information [74] and fusion techniques [75] can classify these methods. In the first technique, the noise is removed directly by using temporal or spatial information.…”

Section: F Tracking Based Detection (mentioning)

confidence: 99%

ETDR: An Exploratory View of Text Detection and Recognition in Images and Videos

Lokkondra, Ramegowda, Thimmaiah et al. 2021

RIA

Images and videos with text content are a direct source of information. Today, there is a high need for image and video data that can be intelligently analyzed. A growing number of researchers are focusing on text identification, making it a hot issue in machine vision research. As a result, several real-time applications such as text detection, localization, and tracking have become more prevalent in text analysis systems. This survey examines how text information may be extracted. First, it presents a trustworthy dataset for text identification in images and videos. Second, it details the numerous text formats found in both images and video. Third, it describes the process flow for extracting information from text and the existing machine learning and deep learning techniques used to train the model. Fourth, it explains the assessment measures used to validate the model. Finally, it integrates the uses and difficulties of text extraction across a wide range of fields. The difficulties focus on the most frequent challenges faced in the real world, such as capturing techniques, lighting, and environmental conditions. Images and videos have evolved into valuable sources of data. The text inside images and video provides a massive quantity of facts and statistics. However, such data is not easy to access. This exploratory view provides easier and more accurate mathematical modeling and evaluation techniques to retrieve the text in images and video in an accessible form.

“…This is because, most of the non-horizontal text lines are scene text, which is much more difficult to detect due to varying lighting and complex transformations [2], [1]. In addition, most of the proposed methods discussed the experimental works on static images, but not on the video frames [17]. In this context, we propose a video based text localization system that takes into account, shot detection from a video, key frame identification from the shots followed by text localization in each key frames.…”

Section: Introduction (mentioning)

confidence: 99%

Discrete Wavelet Transform and Gradient Difference Based Approach for Text Localization in Videos

Shekar, Smitha, Shivakumara 2014

2014 Fifth International Conference on Signal and Image Processing

Abstract: Text detection and localization are important for video analysis and understanding. Scene text in video contains semantic information and thus can contribute significantly to video retrieval and understanding. However, most approaches detect scene text in still images or a single video frame. Videos differ from images in their temporal redundancy. This paper proposes a novel hybrid method to robustly localize text in natural scene images and videos based on the fusion of the discrete wavelet transform and gradient difference. A set of rules and geometric properties has been devised to localize the actual text regions. Then, a morphological operation is performed to generate the text regions, and finally connected component analysis is employed to localize the text in a video frame. The experimental results obtained on the publicly available standard ICDAR 2003 and Hua datasets illustrate that the proposed method can accurately detect and localize text of various sizes, fonts, and colors. Experiments on a large collection of video databases show that the proposed method is suitable for video data.
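The pipeline outlined in this abstract (wavelet response, gradient difference, fusion, connected component analysis) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. Everything specific in it is an assumption: the one-level Haar wavelet, the forward-difference gradient with a max-minus-min window, the AND-style fusion, the 0.4 threshold, the minimum area, and all function names. The rule-based filtering and the morphological step are omitted for brevity.

```python
from collections import deque

def haar_detail_energy(img):
    """One-level 2D Haar transform over 2x2 blocks (assumed wavelet).
    Returns |LH| + |HL| + |HH| per block, copied back to input resolution."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(0, h - 1, 2):
        for x in range(0, w - 1, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            lh = (a + b - c - d) / 2.0  # responds to horizontal edges
            hl = (a - b + c - d) / 2.0  # responds to vertical edges
            hh = (a - b - c + d) / 2.0  # responds to diagonal detail
            energy = abs(lh) + abs(hl) + abs(hh)
            for dy in (0, 1):
                for dx in (0, 1):
                    out[y + dy][x + dx] = energy
    return out

def gradient_difference(img, half_window=1):
    """Max minus min of the horizontal forward-difference gradient
    inside a small horizontal window (assumed definition)."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w - 1):
            grad[y][x] = img[y][x + 1] - img[y][x]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = grad[y][max(0, x - half_window):min(w, x + half_window + 1)]
            out[y][x] = max(window) - min(window)
    return out

def normalize(m):
    """Scale a non-negative map to [0, 1]; an all-zero map stays zero."""
    peak = max(max(row) for row in m)
    return [[v / peak if peak else 0.0 for v in row] for row in m]

def fuse(wavelet, grad_diff, threshold=0.4):
    """Keep pixels where both normalized cues exceed the (assumed) threshold."""
    wn, gn = normalize(wavelet), normalize(grad_diff)
    return [[a >= threshold and b >= threshold for a, b in zip(rw, rg)]
            for rw, rg in zip(wn, gn)]

def connected_boxes(mask, min_area=4):
    """4-connected component analysis; returns (x0, y0, x1, y1) boxes."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            x0 = x1 = sx
            y0 = y1 = sy
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((x0, y0, x1, y1))
    return boxes

def localize_text(img):
    """Fuse the two cues for one grayscale frame and return candidate boxes."""
    mask = fuse(haar_detail_energy(img), gradient_difference(img))
    return connected_boxes(mask)
```

Here `img` is a grayscale frame given as a list of rows of pixel intensities. Requiring both cues to agree is only one plausible fusion rule; a weighted sum of the two normalized maps would be another.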