Abstract. In this paper, we study how to automatically predict the reliability of web pages in the medical domain. Assessing the reliability of online medical information is especially critical, as it may influence vulnerable patients seeking help online. Unfortunately, no automated systems are currently available that can classify a medical web page as reliable, and manual assessment cannot scale to the large number of medical pages on the Web. We propose a supervised learning approach to automatically predict the reliability of medical web pages. We developed a gold standard dataset using the standard reliability criteria defined by the Health on the Net Foundation and systematically experimented with different link-based and content-based feature sets. Our experiments show promising results, with prediction accuracies of over 80%. We also show that our proposed prediction method is useful in applications such as reliability-based re-ranking and automatic website accreditation.
Web communities such as healthcare web forums serve as popular platforms for users to get their complex medical queries resolved. A typical forum thread contains a query in its first post and a discussion around it in subsequent posts. However, many users do not receive satisfactory responses from other members of the community, leaving them dissatisfied. We propose to help these users by exploiting an existing collection of discussion threads. Often, many users suffer from the same medical condition and start multiple discussion threads on very similar queries. In this paper, we develop and evaluate a variety of specialized search methods that treat an entire unresolved forum post as a query and retrieve forum threads discussing similar problems to help resolve it. The task is more challenging than a traditional document retrieval problem, since forum posts can contain a lot of irrelevant background information. The discussion threads to be retrieved are also quite different from traditional unstructured text documents. We evaluate our results on a dataset comprising over 350K discussion threads and show that our proposed methods outperform state-of-the-art retrieval methods for the task. In particular, methods based on non-uniform weighting of thread posts and semantic analysis of the query text perform quite well.
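As a rough illustration of what non-uniform weighting of thread posts could look like, the sketch below scores a thread against a query post by combining per-post similarities with position-dependent weights. It is not the authors' method: the bag-of-words cosine similarity, the `first_post_weight` parameter, and the function names `score_thread` and `rank_threads` are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_thread(query, posts, first_post_weight=0.6):
    """Score a thread (list of post strings) against a query post.

    Assumption for illustration: the first post, which states the problem,
    gets `first_post_weight`; the replies share the remaining weight equally.
    """
    if not posts:
        return 0.0
    q = Counter(tokenize(query))
    sims = [cosine(q, Counter(tokenize(p))) for p in posts]
    if len(posts) == 1:
        return sims[0]
    reply_weight = (1.0 - first_post_weight) / (len(posts) - 1)
    return first_post_weight * sims[0] + reply_weight * sum(sims[1:])


def rank_threads(query, threads):
    """Return thread ids sorted from best to worst match for the query."""
    return sorted(
        threads,
        key=lambda tid: score_thread(query, threads[tid]),
        reverse=True,
    )


# Hypothetical toy data, not taken from the paper's dataset.
threads = {
    "migraine": [
        "severe migraine with nausea every morning",
        "try keeping a headache diary",
    ],
    "knee": [
        "knee pain after running",
        "rest and ice helped me",
    ],
}
ranking = rank_threads("I get a migraine and nausea in the morning", threads)
```

With uniform weights, a long tail of loosely related replies can dilute or inflate a thread's score; giving the opening post a larger share is one simple way to emphasize the part of the thread that states the problem.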
The exponential growth in the volume of publications in the biomedical domain has made it impossible for an individual to keep pace with the advances. Even though evidence-based medicine has gained wide acceptance, physicians are unable to access the relevant information in the required time, leaving most of their questions unanswered. This accentuates the need for fast and accurate biomedical question answering systems. In this paper, we introduce INDOC, a biomedical question answering system based on novel ideas for indexing and for extracting the answers to the questions posed. INDOC displays the results in clusters to help the user arrive at the most relevant set of documents quickly. Evaluation was done against the standard OHSUMED test collection. Our system achieves high accuracy and minimizes user effort.