A vertical search engine searches text within a specific domain. In this project, we built a pharmaceutical vertical search engine that uses a supervised learning classifier, Rocchio, to classify documents into two classes: pharmaceutical and computer science. A small document collection was created to train the classifier. The classifier was evaluated on abstracts from 86 research papers and achieved 90% accuracy. An inverted index is built from the terms of the selected pharmaceutical documents. An interface was also developed for user interaction. Users can issue simple keyword-like queries, and documents are retrieved using TF-IDF statistics and the BM25 weighting scheme. Retrieved results are ranked in descending order of relevance score. New information can be classified and added to the index through the search interface. The system was designed and developed using the Spiral Model and implemented with .NET tools. Surveys and interviews were also used to identify needs and prioritize tasks.
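The pipeline described above (Rocchio classification, an inverted index, and BM25 ranking) can be illustrated with a minimal sketch. It is written in Python purely for illustration; the original system was implemented with .NET tools and its code is not shown here. All function names, the whitespace tokenizer, the smoothed IDF, and the BM25 parameters `k1=1.2` and `b=0.75` are assumptions, not details taken from the project.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()

def tf_idf_vectors(docs):
    """Return (vectors, idf) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF (assumed)
    vecs = [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]
    return vecs, idf

def centroid(vectors):
    # Average the TF-IDF vectors of one class into a prototype vector.
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_rocchio(labeled):
    """labeled: list of (text, label). Returns (class centroids, idf)."""
    vecs, idf = tf_idf_vectors([tokenize(text) for text, _ in labeled])
    by_label = defaultdict(list)
    for vec, (_, label) in zip(vecs, labeled):
        by_label[label].append(vec)
    return {label: centroid(vs) for label, vs in by_label.items()}, idf

def classify(text, centroids, idf):
    # Assign the class whose centroid is most similar to the document.
    # Terms unseen in training get weight 0.
    vec = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(text)).items()}
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

def build_index(docs):
    """docs: dict of doc_id -> text. Returns (postings, doc_lengths)."""
    postings = defaultdict(dict)  # term -> {doc_id: term frequency}
    lengths = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for t, c in Counter(tokens).items():
            postings[t][doc_id] = c
    return postings, lengths

def bm25_search(query, postings, lengths, k1=1.2, b=0.75):
    """Score documents with BM25 and return them sorted by descending score."""
    n = len(lengths)
    avgdl = sum(lengths.values()) / n
    scores = defaultdict(float)
    for t in tokenize(query):
        plist = postings.get(t, {})
        if not plist:
            continue
        idf = math.log(1 + (n - len(plist) + 0.5) / (len(plist) + 0.5))
        for doc_id, tf in plist.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avgdl)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In this sketch, a new document would first be passed to `classify`, and only documents labeled as pharmaceutical would be added to the index with `build_index`; queries are then answered by `bm25_search`. The class labels and toy documents used to exercise it are hypothetical.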