Motivation
Figures in biomedical papers communicate essential information with the potential to identify relevant documents in biomedical and clinical settings. However, academic search interfaces mainly search over text fields.
Results
We describe a search system for biomedical documents that leverages image modalities and an existing index server. We integrate a problem-specific taxonomy of image modalities and image-based data into a custom search system. Our solution features a front-end interface to enhance classical document search results with image-related data, including page thumbnails, figures, captions, and image modality information. We demonstrate the system on a subset of the CORD-19 document collection. A quantitative evaluation demonstrates higher precision and recall for biomedical document retrieval. A qualitative evaluation with domain experts further highlights our solution’s benefits to biomedical search.
Availability and implementation
A demonstration is available at https://runachay.evl.uic.edu/scholar. Our code and image models can be accessed via github.com/uic-evl/bio-search. The dataset is continuously expanded.