The growth in digital camera usage, combined with the worldwide abundance of text, has opened a rich new era for a classic problem of pattern recognition: reading. While traditional document processing often faces challenges such as unusual fonts, noise, and unconstrained lexicons, scene text reading amplifies these challenges and introduces new ones, such as motion blur, curved layouts, perspective projection, and occlusion. Reading scene text is a complex problem involving many details that must be handled effectively for robust, accurate results. In this work, we describe and evaluate a reading system that combines several components, using probabilistic methods for coarsely binarizing a given text region, identifying baselines, and jointly performing word and character segmentation during the recognition process. By using scene context to recognize several words together in a line of text, our system gives state-of-the-art performance on three difficult benchmark data sets.
This paper presents a system for open-vocabulary text recognition in images of natural scenes. First, we describe a novel technique for text segmentation that models smooth color changes across images. We combine this with a recognition component based on a conditional random field with histogram of oriented gradients descriptors and incorporate language information from a lexicon to improve recognition performance. Many existing techniques for this problem use language information from a standard lexicon, but such a lexicon may not include many of the words found in images of the environment, such as those on storefront signs and street signs. We avoid this limitation by incorporating language information from a large web-based lexicon of around 13.5 million words. This lexicon contains words encountered during a crawl of the web, so it is likely to contain proper nouns, like business names and street names. We show that our text segmentation method allows for better recognition performance than the current state-of-the-art text segmentation method. We also evaluate the full system on two standard data sets, ICDAR 2003 and ICDAR 2011, and show an increase in word recognition performance compared to the current state-of-the-art methods.
Recognizing text in natural photographs that contain specular highlights and focal blur is a challenging problem. In this paper we describe a new text segmentation method based on inverse rendering, i.e., decomposing an input image into basic rendering elements. Our technique uses iterative optimization to solve for the rendering parameters, including light source, material properties (e.g., diffuse/specular reflectance and shininess), and blur kernel size. We combine our segmentation method with a recognition component and show that by accounting for the rendering parameters, our approach achieves higher text recognition accuracy than previous work, particularly in the presence of color changes and image blur. In addition, the derived rendering parameters can be used to synthesize new text images that imitate the appearance of an existing image.
Student success, a major focus in higher education, requires in part that students remain actively engaged in the required coursework. Identifying student disengagement (when a student stops completing coursework) at scale has been a continuing challenge for higher education due to the heterogeneity of traditional college courses. This research uses data from Connect by McGraw-Hill Education, a widely used online learning tool, to build a classifier to identify learning tool disengagement at scale. This classifier was trained and tested on four years of historical data, representing 4.5 million students in 175,000 courses across 256 disciplines. Results show that the classifier is effective in identifying disengagement within the online learning tool against baselines, across time, and within and across disciplines. The classifier was also effective in identifying students at risk of disengaging from Connect and then earning unsuccessful grades in a pilot course for which the assignments in Connect were worth a relatively small portion of the overall course grade. Because Connect is widely used, this classifier is positioned to be a good tool for instructors and institutions to identify students at risk of disengagement from coursework. Instructors and institutions can use this information to design and implement interventions that improve engagement and student success in key courses at the institution.