Abstract. We present a short survey of the literature on indexing and retrieval of mathematical knowledge, with pointers to 72 papers and tentative taxonomies of both retrieval problems and recurring techniques.
Purpose Driven Taxonomy of Retrieval Problems

Retrieval of mathematical knowledge is always presented as the low-hanging fruit of Mathematical Knowledge Management, and it has been addressed in several papers by researchers from both the formal methods and the information retrieval communities. Because the problem resists classical content search techniques [LRG13], it is usually addressed by combining a small set of new ideas and techniques that recur in the literature. Despite the amount of work, however, no single solution clearly outperforms the others, nor are there convincing unbiased benchmarks for comparing solutions. Some authors, such as [KK07], also suggest that the community should first gain a better, unbiased understanding of the actual needs of mathematicians in order to improve MKM technology as a whole.

In this paper we collect a hopefully comprehensive bibliography, and we roughly classify the papers according to novel taxonomies of both the problems and the techniques employed. The only other surveys on the same topic are [AZ04], now outdated and focused mostly on (European) research projects that contributed to the topic in the 6th Framework Programme; [ZB12], which covers less literature in much greater detail without attempting a classification; [L13], which focuses on the evaluation of mathematics retrieval; and [L10], which is written in Slovak.

We begin our discussion with a purpose driven taxonomy made of three different retrieval problems that deal with mathematical knowledge. Each problem is characterised by its own set of expectations and constraints, and applying a solution designed for one problem to another may be infeasible or yield poor results. In the next sections we classify the papers according to an encoding based taxonomy (presentation vs content vs semantics) and to a taxonomy of the techniques employed.
Finally, we point to the rich literature on the problem of ranking, and we touch on the problem of evaluating systems. We conclude with some notes on the availability of math retrieval systems.

⋆ The final publication is available at http://link.springer.com.