Information retrieval with concept-based pseudo-relevance feedback in MEDLINE

Jalali, Vahid; Borujerdi, Mohammad Reza Matash

doi:10.1007/s10115-010-0327-7

Cited by 22 publications

(15 citation statements)

References 17 publications


“…The assignment of MeSH terms is a binary decision by professionals based on their interpretation of the content and use of the thesaurus. While MeSH terms have shown great effectiveness in many IR applications (Shin & Han, 2004; Meij, et al, 2010; Jalali & Borujerdi, 2011), the current binary model of description using MeSH terms is insufficient in reflecting the inherent uncertainties in the subject indexing process (Mai, 2001). It has been noted that a piece of work can be related to multiple facets and each facet could have different importance depending on whether it is the major or minor point.…”

Section: Introduction (mentioning)

confidence: 99%

Automatically infer subject terms and documents associations through text mining

Lü

Mao

2013

Proc of Assoc for Info


Subject indexing is an intellectually intensive process that bears many inherent uncertainties. Existing subject indexing systems generally produce binary outcomes, whether or not to assign an indexing term, which does not sufficiently reflect the extent to which the indexing terms are associated with documents. On the other hand, probabilistic models have seen great success in capturing the uncertainties in the automatic indexing process. One hurdle to achieving weighted indexing in the manual subject indexing process is the practical burden that it could add to the already intensive indexing process. In this study, we propose a method to automatically infer the associations between subject terms and documents through text mining. By uncovering the connections between MeSH terms and document text, we are able to derive the weights of MeSH terms in documents. Our initial results suggest that the new method is feasible and promising. The study has practical implications for improving subject indexing practice.


“…Queries in the dataset are, on average, 14 terms long, which is much shorter than the queries considered in this article (80 terms). After its introduction, the OHSUMED collection has been extensively used to evaluate classification (e.g., Genkin, Lewis, & Madigan; Han & Karypis; Xu & Li), learning to rank (e.g., Cao et al; Duh & Kirchhoff; Liu, Xu, Qin, Xiong, & Li), and query reformulation (Abdou & Savoy; Dong, Srimani, & Wang; Haveliwala; Hersh, Price, & Donohoe; Jalali & Borujerdi; Liu & Chu; Srinivasan; Thesprasith & Jaruskulchai). Works in the latter group are the most similar to our systems; they can be further partitioned based on the approach used: ontology‐based reformulation, Pseudo Relevance Feedback (PRF), and a combination of the two.…”

Section: Related Work (mentioning)

confidence: 99%

Learning to reformulate long queries for clinical decision support

Soldaini

Yates

Goharian

2017

Asso for Info Science & Tech


The large volume of biomedical literature poses a serious problem for medical professionals, who often struggle to keep current with it. At the same time, many health providers consider knowledge of the latest literature in their field a key component of successful clinical practice. In this work, we introduce two systems designed to help retrieve medical literature. Both receive a long, discursive clinical note as the input query and return highly relevant literature that could be used in support of clinical practice. The first system is an improved version of a method previously proposed by the authors; it combines pseudo relevance feedback and a domain-specific term filter to reformulate the query. The second is an approach that uses a deep neural network to reformulate a clinical note. Both approaches were evaluated on the 2014 and 2015 TREC CDS datasets; in our tests, they outperform the previously proposed method by up to 28% in inferred NDCG; furthermore, they are competitive with the state of the art, achieving up to 8% improvement in inferred NDCG.


“…Kayaalp et al 2003; Díaz-Galiano et al 2009; Pestana 2009; Jalali and Borujerdi 2011; Yeganova et al 2011; Darmoni et al 2012) and word sense disambiguation (e.g.…”

Section: Pubmed (mentioning)

confidence: 99%

Text Genres and Registers: The Computation of Linguistic Features

Fang

Cao

2015
