Abstract-We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator dominance are used to allow efficient recognition of symbol layout even when symbols deviate greatly from their ideal positions. Next, the Lexical Pass produces a Lexed BST from the initial BST by grouping tokens comprised of multiple input symbols; these include decimal numbers, function names, and symbols comprised of nonoverlapping primitives such as "=". The Lexical Pass also labels vertical structures such as fractions and accents. The Lexed BST is translated into L A T E X. Additional processing, necessary for producing output for symbolic algebra systems, is carried out in the Expression Analysis Pass. The Lexed BST is translated into an Operator Tree, which describes the order and scope of operations in the input expression. The tree manipulations used in each pass are represented compactly using tree transformations. The compiler-like architecture of the system allows robust handling of unexpected input, increases the scalability of the system, and provides the groundwork for handling dialects of mathematical notation.
Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information is fairly mature, with wide-spread availability of Optical Character Recognition (OCR) and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest is the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and developing effective indexing, retrieval and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.
We present the outcome of the latest edition of the CROHME competition, dedicated to on-line handwritten mathematical expression recognition. In addition to the standard full expression recognition task from previous competitions, CROHME 2014 features two new tasks. The first is dedicated to isolated symbol recognition including a reject option for invalid symbol hypotheses, and the second concerns recognizing expressions that contain matrices. System performance is improving relative to previous competitions. Data and evaluation tools used for the competition are publicly available.
We present a technique for using content-based video labeling as a CAPTCHA task. Our CAPTCHAs are generated from YouTube videos, which contain labels (tags) supplied by the person that uploaded the video. They are graded using a video's tags, as well as tags from related videos. In a user study involving 184 participants, we were able to increase the human success rate on our video CAPTCHA from roughly 70% to 90%, while keeping the success rate of a tag frequency-based attack fixed at around 13%. Through a different parameterization of the challenge generation and grading algorithms, we were able to reduce the success rate of the same attack to 2%, while still increasing the human success rate from 70% to 75%. The usability and security of our video CAPTCHA appears to be comparable to existing CAPTCHAs, and a majority of participants (60%) indicated that they found the video CAPTCHAs more enjoyable than traditional CAPTCHAs in which distorted text must be transcribed.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.