2011
DOI: 10.1117/12.873312

Keyword and image-based retrieval of mathematical expressions

Abstract: Two new methods for retrieving mathematical expressions using conventional keyword search and expression images are presented. An expression-level TF-IDF (term frequency-inverse document frequency) approach is used for keyword search, where queries and indexed expressions are represented by keywords taken from LaTeX strings. TF-IDF is computed at the level of individual expressions rather than documents to increase the precision of matching. The second retrieval technique is a form of Content-Based Image R…
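To make the expression-level idea in the abstract concrete, here is a minimal sketch in which each LaTeX expression is treated as its own "document" for TF-IDF purposes. The tokenizer, the smoothed IDF formula, the cosine ranking, and the names `tokenize`, `build_index`, and `search` are all illustrative assumptions; they are not taken from the paper, whose exact weighting and tokenization are not given here.

```python
import math
import re
from collections import Counter


def tokenize(latex):
    """Split a LaTeX string into keywords (assumed scheme, not the paper's):
    commands such as \\frac, digit runs, and single non-brace symbols."""
    return re.findall(r"\\[A-Za-z]+|\d+|[^\s{}]", latex)


def build_index(expressions):
    """Compute TF-IDF vectors with every expression as its own 'document'."""
    docs = [Counter(tokenize(e)) for e in expressions]
    n = len(docs)
    # Document frequency: number of expressions that contain each token.
    df = Counter(tok for d in docs for tok in d)
    # Smoothed IDF (an assumption; other variants are equally plausible).
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    vectors = [{tok: tf * idf[tok] for tok, tf in d.items()} for d in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, expressions, idf, vectors, k=20):
    """Return the top-k (score, expression) pairs for a LaTeX query."""
    counts = Counter(tokenize(query))
    # Tokens never seen in the index carry no weight and are dropped.
    q = {tok: tf * idf[tok] for tok, tf in counts.items() if tok in idf}
    scored = [(cosine(q, v), e) for v, e in zip(vectors, expressions)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Because document frequency is counted over expressions rather than whole files, a token that is rare among expressions keeps a high weight even if it occurs in most source documents, which is one plausible reading of why expression-level TF-IDF would sharpen matching.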

Cited by 27 publications (17 citation statements)
References 9 publications
“…Our search queries and document database were the same that Zanibbi and Yuan used in their experiment [5]. We evaluated our results through an online Table 1. Precision-at-k (k = 20) for 10 search queries (collection: 24,479 expressions from 50 LaTeX documents).…”
Section: Experiments and Results; citation type: mentioning
confidence: 99%
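The precision-at-k measure named in the quoted table caption (k = 20) can be sketched as follows. This uses the standard definition, the fraction of the top k results that are judged relevant; the function name `precision_at_k` and the use of plain identifiers for results are assumptions for illustration, and the citing paper's actual relevance-judgment procedure is not described here.

```python
def precision_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the top-k ranked results that are relevant.

    ranked_ids: result identifiers in ranked order, best first.
    relevant_ids: set of identifiers judged relevant for the query.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for r in top if r in relevant_ids)
    # Divide by k even when fewer than k results were returned,
    # so short result lists are not rewarded.
    return hits / k
```

For example, if 2 of the first 4 returned expressions are relevant, `precision_at_k(ranked, relevant, k=4)` gives 0.5.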
“…Zanibbi and Yuan have created a keyword-based MIR system using Lucene that they show to be effective [5]. It uses a vector-space approach to store LaTeX expressions and a term frequency-inverse document frequency (TF-IDF) model to rank search results. We will be using their system in our tests for comparison.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Presentation-based methods only consider the variables, numbers and operators appearing in an expression, with no idea about the semantic meanings behind the tokens [10,19,20,7]. For example, one can treat the expression as an ordinary sentence in NLP after proper preprocessing.…”
Section: Mathematical Search; citation type: mentioning
confidence: 99%
“…The contributions of this paper include: 1) to our knowledge, the first system for querying math in technical documents using images of handwritten queries (image-based retrieval of isolated LaTeX expressions has been explored [11]), and 2) an experimental validation of the proposed technique. We summarize the underlying segmentation and image matching algorithms in Sections II and III, provide the indexing and retrieval model in Section IV, present an experiment testing our model with handwritten and image queries in Section V, and conclude in Section VI.…”
Section: Introduction; citation type: mentioning
confidence: 99%