A significant amount of information about drug-related safety issues, such as adverse effects, is published in medical case reports that, due to their unstructured nature, can only be explored by human readers. The work presented here aims to generate a systematically annotated corpus that can support the development and validation of methods for the automatic extraction of drug-related adverse effects from medical case reports. The documents are systematically double-annotated in several rounds to ensure consistent annotations, and the annotated documents are finally harmonized to generate representative consensus annotations. To demonstrate an example use case, the corpus was employed to train and validate models for the classification of informative versus non-informative sentences. A Maximum Entropy classifier trained with simple features and evaluated by 10-fold cross-validation achieved an F₁ score of 0.70, indicating a potentially useful application of the corpus.
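The evaluation protocol described above can be illustrated with a minimal sketch, not the authors' actual pipeline: a Maximum Entropy classifier (equivalently, logistic regression) trained on simple bag-of-words features and scored by 10-fold cross-validated F₁. All sentences and labels below are invented for illustration only.

```python
# Hedged sketch of the use case: MaxEnt sentence classification with simple
# features, evaluated by 10-fold cross-validation. Toy data, not the corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# 1 = informative (mentions a drug-related adverse effect), 0 = non-informative.
informative = [
    "The patient developed severe hepatotoxicity after starting methotrexate.",
    "Rash and fever occurred two days after amoxicillin was administered.",
    "Treatment with clozapine was followed by agranulocytosis.",
    "She experienced QT prolongation while on haloperidol.",
    "Ibuprofen therapy was associated with acute renal failure.",
] * 2  # replicated so each class has >= 10 samples, as 10-fold CV requires
non_informative = [
    "The patient was admitted to the hospital on Monday.",
    "Laboratory values were within normal limits at baseline.",
    "A 54-year-old man presented to the outpatient clinic.",
    "Follow-up was scheduled after four weeks.",
    "The family history was unremarkable.",
] * 2

sentences = informative + non_informative
labels = [1] * len(informative) + [0] * len(non_informative)

# Maximum Entropy == multinomial logistic regression; unigram count features.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))

# 10-fold cross-validated F1, mirroring the evaluation protocol in the text.
f1_scores = cross_val_score(model, sentences, labels, cv=10, scoring="f1")
print(f"mean F1 over 10 folds: {f1_scores.mean():.2f}")
```

On the real corpus the authors report F₁ = 0.70; the toy data here is only meant to show the shape of the evaluation loop.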
The sheer amount of information about potential adverse drug events published in medical case reports poses major challenges for drug safety experts seeking to perform timely monitoring. Efficient strategies for identifying and extracting information about potential adverse drug events from free-text resources are needed to support pharmacovigilance research and pharmaceutical decision making. This work therefore focuses on adapting a machine learning-based system for the identification and extraction of potential adverse drug event relations from MEDLINE case reports. It relies on a high-quality corpus that was manually annotated using an ontology-driven methodology. Qualitative evaluation of the system showed robust results, and an experiment with large-scale relation extraction from MEDLINE uncovered under-recognized potential adverse drug events not reported in drug monographs. Overall, this approach provides a scalable auto-assistance platform that helps drug safety professionals automatically collect potential adverse drug events communicated as free-text data.
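One common step in relation extraction pipelines of the kind described above is candidate generation: every drug mention is paired with every adverse event mention in a sentence, and a trained classifier then labels each pair as a true relation or not. The sketch below shows only this candidate step, under assumed toy entity lists; it is not the published system.

```python
# Hedged sketch: enumerate (drug, event) candidate pairs from entity mentions
# in one sentence. A downstream ML classifier would then filter these pairs.
# Entity lists are invented assumptions for illustration.
from itertools import product

def candidate_relations(drugs, events):
    """Enumerate every drug/event mention pair as a candidate relation."""
    return list(product(drugs, events))

drugs = ["clozapine"]
events = ["agranulocytosis", "fever"]
print(candidate_relations(drugs, events))
# -> [('clozapine', 'agranulocytosis'), ('clozapine', 'fever')]
```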
Speculative statements communicating experimental findings are frequently found in scientific articles, and their purpose is to provide an impetus for further investigation of the given topic. Automated recognition of speculative statements in scientific text has gained interest in recent years, as systematic analysis of such statements could transform speculative thoughts into testable hypotheses. We describe here a pattern matching approach for the detection of speculative statements in scientific text that uses a dictionary of speculative patterns to classify sentences as hypothetical. To demonstrate the practical utility of our approach, we applied it to the domain of Alzheimer's disease and showed that it captures a wide spectrum of scientific speculations on Alzheimer's disease. Subsequent exploration of the derived hypothetical knowledge leads to the generation of a coherent overview of emerging knowledge niches, and can thus provide added value to ongoing research activities.
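The dictionary-driven pattern matching idea can be sketched as follows. This is a minimal illustration, not the authors' system: the pattern list is an invented assumption, and real speculation dictionaries are far larger and more nuanced.

```python
# Hedged sketch: classify a sentence as speculative if it matches any entry
# in a small dictionary of hedging patterns. The patterns below are toy
# assumptions; a production dictionary would be much more comprehensive.
import re

SPECULATIVE_PATTERNS = [
    r"\bmay\b", r"\bmight\b", r"\bcould\b",
    r"\bsuggest(s|ed|ing)?\b", r"\bpossibl(e|y)\b",
    r"\bhypothes(is|ize|ized)\b", r"\bremains to be\b",
    r"\bwhether\b",
]
SPECULATIVE_RE = re.compile("|".join(SPECULATIVE_PATTERNS), re.IGNORECASE)

def is_speculative(sentence: str) -> bool:
    """Return True if the sentence matches any speculative pattern."""
    return bool(SPECULATIVE_RE.search(sentence))

print(is_speculative("Amyloid beta may trigger tau phosphorylation."))      # True
print(is_speculative("The patients were enrolled between 2005 and 2010."))  # False
```

Sentences flagged this way could then be collected and grouped to build the kind of overview of emerging hypotheses the text describes.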
Changes in drug labels can be predicted automatically using data and text mining techniques. Text mining technology is mature and well placed to support pharmacovigilance tasks.