Dynamic text in scene videos provides valuable semantic cues for many video applications. However, the movement of this text presents unique challenges, such as blur, positional shifts, and occlusions. While efficient at tracking text, state-of-the-art systems often struggle when text becomes obscured or the scene grows complex. This study introduces a novel method for detecting and tracking video text, specifically designed to predict the location of occluded text in subsequent frames using a tracking-by-detection paradigm. Our approach begins with a primary detector that identifies text within individual frames, which enhances tracking accuracy. Using the Kalman filter, the Munkres algorithm, and deep visual features, we establish connections between text instances across frames. Our technique rests on the idea that when text disappears from a frame due to obstruction, its previous velocity and location can be used to predict its next position. Experiments conducted on the ICDAR2013 Video and ICDAR2015 Video datasets confirm our method's efficacy, matching or surpassing established methods in performance.
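
To make the tracking-by-detection pipeline described above concrete, the following is a minimal sketch, not the authors' implementation: each track carries a constant-velocity Kalman filter over the box centre, predicted boxes are matched to new detections with the Munkres (Hungarian) algorithm, and unmatched tracks simply keep their predicted position, standing in for occluded text. The box format, motion model, and IoU-only cost (omitting the deep visual features) are illustrative assumptions.

```python
# Illustrative sketch of Kalman prediction + Munkres association for text boxes.
# Assumptions: boxes are (x1, y1, x2, y2); cost is 1 - IoU (no appearance features).
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


class Track:
    """Constant-velocity Kalman filter over the box centre; box size is kept fixed."""

    def __init__(self, box):
        cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
        self.w, self.h = box[2] - box[0], box[3] - box[1]
        self.x = np.array([cx, cy, 0.0, 0.0])        # state: position + velocity
        self.P = np.eye(4) * 10.0                    # state covariance
        self.F = np.array([[1, 0, 1, 0],             # constant-velocity transition
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], float)
        self.H = np.eye(2, 4)                        # we observe the position only
        self.Q = np.eye(4) * 0.01                    # process noise
        self.R = np.eye(2) * 1.0                     # measurement noise

    def predict(self):
        """Advance the state one frame; used as-is when the text is occluded."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.box()

    def update(self, box):
        """Correct the state with a matched detection."""
        z = np.array([(box[0] + box[2]) / 2, (box[1] + box[3]) / 2])
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        self.w, self.h = box[2] - box[0], box[3] - box[1]

    def box(self):
        cx, cy = self.x[0], self.x[1]
        return (cx - self.w / 2, cy - self.h / 2, cx + self.w / 2, cy + self.h / 2)


def associate(tracks, detections, min_iou=0.3):
    """Match predicted track boxes to detections with the Munkres algorithm."""
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    preds = [t.predict() for t in tracks]             # predict once per track
    cost = np.array([[1.0 - iou(p, d) for d in detections] for p in preds])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - min_iou]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_t = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_d = [j for j in range(len(detections)) if j not in matched_d]
    return matches, unmatched_t, unmatched_d
```

In this sketch, a track left unmatched in a frame keeps the box returned by `predict()`, which is how occluded text is carried forward from its previous velocity and location; the full method additionally weighs deep visual features in the association cost.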