As the volume of biomedical documents grows, automatic gene- or protein-based document indexing and ranking models become increasingly important for large biomedical databases. Traditional single-entity-based biomedical document indexing and ranking models restrict the search space within a high-dimensional feature space. However, these traditional models do not find and extract relevant documents using genes or proteins, and they do not rank biomedical documents efficiently using weighted genes or weighted proteins on large biomedical repositories. We focus on the problem of indexing, extracting, and ranking biomedical document sets using gene or protein entities on large databases. In the proposed model, a novel MapReduce-based natural language processing framework is designed and implemented on large biomedical databases using weighted gene or protein measures and a document ranking score. Experimental results show that the proposed model achieves higher contextual ranking accuracy, a smaller search space, and lower time consumption than traditional biomedical document ranking models.
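To make the general idea concrete, the following is a minimal single-process sketch of a MapReduce-style gene-weighted index and ranking step. It is not the authors' implementation: the abstract does not specify the weighting measure or the ranking score, so the TF-IDF weight, the summed score, the whitespace tokenizer, and all names (`map_phase`, `reduce_phase`, `build_index`, `rank`, `gene_lexicon`) are assumptions chosen purely for illustration.

```python
import math
from collections import defaultdict


def map_phase(doc_id, text, gene_lexicon):
    """Map step: emit (gene, (doc_id, count)) for each lexicon gene found in one document."""
    counts = defaultdict(int)
    for token in text.lower().split():
        token = token.strip(".,;:()")  # naive punctuation stripping (assumption)
        if token in gene_lexicon:
            counts[token] += 1
    return [(gene, (doc_id, n)) for gene, n in counts.items()]


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce runtime would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(gene, postings, num_docs):
    """Reduce step: weight each posting with TF-IDF (an assumed stand-in weighting)."""
    idf = math.log(num_docs / len(postings)) + 1.0
    return gene, {doc_id: tf * idf for doc_id, tf in postings}


def build_index(docs, gene_lexicon):
    """Run map, shuffle, and reduce sequentially to build a gene -> {doc: weight} index."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text, gene_lexicon))
    grouped = shuffle(pairs)
    return dict(reduce_phase(g, p, len(docs)) for g, p in grouped.items())


def rank(index, query_genes):
    """Score each document as the sum of its weights for the queried genes."""
    scores = defaultdict(float)
    for gene in query_genes:
        for doc_id, weight in index.get(gene, {}).items():
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = {
        "d1": "BRCA1 interacts with TP53. BRCA1 is mutated.",
        "d2": "TP53 regulates apoptosis.",
        "d3": "No genes here.",
    }
    index = build_index(docs, {"brca1", "tp53"})
    for doc_id, score in rank(index, ["brca1", "tp53"]):
        print(doc_id, round(score, 3))
```

Because only documents that mention a queried gene appear in the index, ranking touches a small posting list rather than the full feature space. In a real deployment the map and reduce functions would run on a distributed framework, and a biomedical named-entity recognizer would replace the lexicon lookup.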