The rapid growth in published digital information across all disciplines has created a pressing need for techniques that simplify the use of this information. The chemistry literature is very rich in information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature, then "text mining" the extracted data to determine contextual relationships, helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, we briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based, machine learning, and hybrid approaches to chemical named entity recognition, together with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.
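As an illustration of the dictionary-based, rule-based, and hybrid approaches named above, the following is a minimal sketch and not a method taken from the review. The three-entry dictionary, the suffix rule, and the function name `find_chemicals` are all hypothetical; real systems rely on large curated lexicons and far more careful rules.

```python
import re

# Hypothetical toy dictionary; real systems use large curated lexicons.
CHEMICAL_DICTIONARY = {"aspirin", "ibuprofen", "acetic acid"}

# Hypothetical rule: words ending in a few common chemical suffixes.
# This is deliberately crude and will overgenerate on ordinary English words.
SUFFIX_RULE = re.compile(r"\b[A-Za-z]+(?:ol|ane|ene|ide|ate)\b")


def find_chemicals(text):
    """Return (start, end, mention, source) tuples sorted by position.

    Hybrid strategy: dictionary matches are recorded first, then rule
    matches are added only for spans the dictionary did not already cover
    exactly. Assumes ASCII text, so lowercasing preserves character offsets.
    """
    hits = {}
    lowered = text.lower()

    # Dictionary-based pass: case-insensitive whole-word lookup.
    for term in CHEMICAL_DICTIONARY:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits[(match.start(), match.end())] = "dictionary"

    # Rule-based pass: suffix pattern on the original text.
    for match in SUFFIX_RULE.finditer(text):
        hits.setdefault((match.start(), match.end()), "rule")

    return [(s, e, text[s:e], src) for (s, e), src in sorted(hits.items())]


if __name__ == "__main__":
    sample = "Aspirin was dissolved in ethanol and acetic acid."
    for start, end, mention, source in find_chemicals(sample):
        print(f"{start:>3}-{end:<3} {mention:<12} {source}")
```

The sketch shows the usual trade-off in miniature: the dictionary pass is precise but finds only listed names, while the rule pass generalizes to unlisted names at the cost of false positives.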
Purpose: Many opinion-mining systems and tools have been developed to provide users with the attitudes of people toward entities and their attributes, or with the overall polarities of documents. Side effects are one of the critical measures used to evaluate a patient's opinion of a particular drug. However, side effect recognition is a challenging task, since side effects coincide with disease symptoms lexically and syntactically. The purpose of this paper is to extract drug side effects from drug reviews as integral implicit-opinion words. Design/methodology/approach: This paper proposes a detection algorithm for a medical opinion-mining system using rule-based and support vector machine (SVM) algorithms. A corpus of 225 drug reviews was manually annotated by a medical expert for training and testing. Findings: The results show that SVM significantly outperforms the rule-based algorithm. However, the results of both algorithms are encouraging and provide a good foundation for future research. Addressing the limitations and exploiting combined approaches would improve the results. Practical implications: Automatic extraction of adverse drug effect information from online text can help regulatory authorities screen and extract information rapidly instead of inspecting it manually, and contributes to the acceleration of medical decision support and safety alert generation. Originality/value: The results of this study can help database curators compile adverse drug effect databases and help researchers digest the huge and rapidly growing amount of online textual information.