Researchers in technical disciplines who want to search for information related to a particular mathematical expression cannot do so effectively with a text-based search engine unless they know appropriate text keywords. To overcome this difficulty, we demonstrate a math-aware search engine, which extends the capability of existing text search engines to search mathematical content.

Our search engine is composed of a MathFind processing layer implemented on top of a typical text-based search engine layer. Our prototype is built on the Apache Lucene search API, a text retrieval system based on a modified vector space model.

The MathFind layer of the search engine analyzes expressions in MathML, an XML standard for representing mathematical notation. The process decomposes each mathematical expression into a sequence of text-encoded math fragments. These math fragments are analogous to words in a text document. Math fragments combined with text content serve as input to the text search engine. At query time, a graphical equation editor is used to enter a math query, which is internally represented in MathML. The math-processing layer converts the MathML query into a sequence of text-encoded math query terms, which form the basis of a text query performed by the underlying text search engine. To overcome the ambiguity in the presentation of an expression, MathML input is normalized before processing. [1,2]

The current implementation has the following features:
• It indexes a variety of document formats: text + MathML, XHTML + MathML, DocBook + MathML, and, via conversion, LaTeX, MS Word, and Mathematica notebooks.
• It retrieves ranked documents based on similarity to both math and text queries.
• It interprets wildcard queries in math expressions, analogous to text wildcard queries.
• It can highlight math query terms in the retrieved (cached) documents to help users locate matched expressions.
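The decomposition of MathML into text-encoded math fragments can be illustrated with a minimal Python sketch. This is not the MathFind encoding, which the abstract does not specify. The function names (`local_name`, `encode`, `math_fragments`, `indexable_text`) and the token format (`tag:text` for leaves, `tag(child,child)` for composite elements) are assumptions made for illustration, and the only normalization shown is dropping namespaces and whitespace. The sketch also assumes the underlying text engine splits tokens on whitespace only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://www.w3.org/1998/Math/MathML}'."""
    return tag.rsplit("}", 1)[-1]


def encode(node):
    """Return one whitespace-free token describing the subtree rooted at node.

    Hypothetical format: 'mi:x' for a leaf, 'msup(mi:x,mn:2)' for a composite.
    """
    name = local_name(node.tag)
    children = list(node)
    if not children:
        # Remove all whitespace so the token survives whitespace tokenization.
        text = "".join((node.text or "").split())
        return name + ":" + text
    return name + "(" + ",".join(encode(child) for child in children) + ")"


def math_fragments(mathml):
    """Decompose a MathML expression into a sequence of text-encoded fragments.

    Every element except the outer <math> wrapper yields one fragment, so both
    whole subexpressions and their individual symbols become searchable terms.
    """
    root = ET.fromstring(mathml)
    return [encode(node) for node in root.iter() if local_name(node.tag) != "math"]


def indexable_text(text, mathml):
    """Combine ordinary text with math fragments as input for a text search engine."""
    return text + " " + " ".join(math_fragments(mathml))


if __name__ == "__main__":
    expr = "<math><msup><mi>x</mi><mn>2</mn></msup></math>"
    print(math_fragments(expr))
    # prints ['msup(mi:x,mn:2)', 'mi:x', 'mn:2']
```

The same function would be applied to a MathML query at search time, so that query terms and indexed terms share one encoding and the text engine can rank documents by ordinary term similarity.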