Background: Adverse Drug Reactions are one of the leading causes of injury or death among patients undergoing medical treatments. Not all Adverse Drug Reactions are identified before a drug is made available in the market. Current post-marketing drug surveillance methods, which are based purely on voluntary spontaneous reports, are unable to provide the early indications necessary to prevent the occurrence of such injuries or fatalities. The objective of this research is to extract reports of adverse drug side-effects from messages in online healthcare forums and use them as early indicators to assist in post-marketing drug surveillance. Methods: We treat the task of extracting adverse side-effects of drugs from healthcare forum messages as a sequence labeling problem and present a Hidden Markov Model (HMM) based Text Mining system that can be used to classify a message as containing drug side-effect information and then extract the adverse side-effect mentions from it. A manually annotated dataset from http://www.medications.com is used in the training and validation of the HMM-based Text Mining system. Results: A 10-fold cross-validation on the manually annotated dataset yielded on average an F-Score of 0.76 from the HMM Classifier, in comparison to 0.575 from the Baseline classifier. Without the Plain Text Filter component as a part of the Text Processing module, the F-Score of the HMM Classifier was reduced to 0.378 on average, while the absence of the HTML Filter component was found to have no impact. Halving the size of the drug-name dictionary reduced the F-Score of the HMM Classifier to 0.359 on average, while a similar reduction of the side-effects dictionary yielded an F-Score of 0.651 on average. Adverse side-effects mined from http://www.medications.com and http://www.steadyhealth.com were found to match the Adverse Drug Reactions on the Drug Package Labels of several drugs. In addition, some novel adverse side-effects, which can be potential Adverse Drug Reactions, were also identified. Conclusions: The results from the HMM-based Text Miner encourage further enhancement of this approach. The mined novel side-effects can act as early indicators for health authorities to help focus their efforts in post-marketing drug surveillance.
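The sequence-labeling formulation in the abstract above can be illustrated with a minimal Viterbi decoder. This is only a sketch under stated assumptions: the three tags (`O`, `DRUG`, `SE`), the toy probabilities, and the example sentence are hypothetical and are not taken from the paper, whose actual states, features, and training procedure are not described here.

```python
import math

def viterbi(tokens, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely tag sequence for tokens under a toy HMM."""
    def emit(state, token):
        # Unseen tokens get a small floor probability instead of zero.
        return math.log(emit_p[state].get(token, floor))

    # best[t][s] = (log-prob of the best path ending in s at step t, previous state)
    best = [{s: (math.log(start_p[s]) + emit(s, tokens[0]), None) for s in states}]
    for token in tokens[1:]:
        column = {}
        for s in states:
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + emit(s, token), prev)
        best.append(column)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]

# Hypothetical toy parameters (illustrative only, not estimated from data).
states = ["O", "DRUG", "SE"]
start_p = {"O": 0.8, "DRUG": 0.1, "SE": 0.1}
trans_p = {
    "O": {"O": 0.7, "DRUG": 0.15, "SE": 0.15},
    "DRUG": {"O": 0.8, "DRUG": 0.1, "SE": 0.1},
    "SE": {"O": 0.6, "DRUG": 0.1, "SE": 0.3},
}
emit_p = {
    "O": {"i": 0.2, "took": 0.2, "and": 0.2, "got": 0.2, "a": 0.2},
    "DRUG": {"ibuprofen": 0.9},             # stands in for a drug-name dictionary
    "SE": {"headache": 0.5, "nausea": 0.5},  # stands in for a side-effect dictionary
}

tokens = "i took ibuprofen and got a headache".split()
labels = viterbi(tokens, states, start_p, trans_p, emit_p)

# A message counts as containing side-effect information if any token is tagged SE.
has_side_effect = "SE" in labels
side_effects = [tok for tok, tag in zip(tokens, labels) if tag == "SE"]
print(list(zip(tokens, labels)), has_side_effect, side_effects)
```

In this sketch the dictionaries enter only through the emission tables, which gives some intuition for why shrinking the drug-name or side-effect dictionary would be expected to lower the F-Score.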
Background: Detecting epistatic interactions plays a significant role in improving pathogenesis, prevention, diagnosis, and treatment of complex human diseases. Applying machine learning or statistical methods to epistatic interaction detection encounters some common problems, e.g., a very limited number of samples, an extremely large search space, a large number of false positives, and the choice of how to measure the association between disease markers and the phenotype. Results: To address the problems of computational methods in epistatic interaction detection, we propose a score-based Bayesian network structure learning method, EpiBN, to detect epistatic interactions. We apply the proposed method to both simulated datasets and three real disease datasets. Experimental results on simulation data show that our method outperforms some other commonly used methods in terms of power and sample-efficiency, and is especially suitable for detecting epistatic interactions with weak or no marginal effects. Furthermore, our method is scalable to real disease data. Conclusions: We propose a Bayesian network-based method, EpiBN, to detect epistatic interactions. In EpiBN, we develop a new scoring function, which can reflect higher-order epistatic interactions by estimating the model complexity from data, and apply a fast Branch-and-Bound algorithm to learn the structure of a two-layer Bayesian network containing only one target node. To make our method scalable to real data, we propose the use of a Markov chain Monte Carlo (MCMC) method to perform the screening process. Applications of the proposed method to some real GWAS (genome-wide association studies) datasets may provide helpful insights into understanding the genetic basis of Age-related Macular Degeneration, late-onset Alzheimer's disease, and autism.
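To make the idea of score-based structure learning for a single target node concrete, here is a minimal sketch. It is not EpiBN: it uses a standard BIC score in place of the paper's new scoring function, an exhaustive search in place of Branch-and-Bound, and binary-coded SNPs on a small synthetic dataset. The function names and the data are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations, product

def bic_score(genotypes, phenotype, parents):
    """BIC of a binary phenotype node given a tuple of SNP column indices as parents."""
    n = len(phenotype)
    joint = Counter()   # (parent configuration, phenotype value) -> count
    config = Counter()  # parent configuration -> count
    for row, y in zip(genotypes, phenotype):
        key = tuple(row[i] for i in parents)
        joint[(key, y)] += 1
        config[key] += 1
    # Maximized log-likelihood of the phenotype given its parents.
    log_lik = sum(c * math.log(c / config[key]) for (key, _), c in joint.items())
    # One free parameter per parent configuration (binary SNPs, binary phenotype).
    n_params = 2 ** len(parents)
    return log_lik - 0.5 * n_params * math.log(n)

def best_parent_set(genotypes, phenotype, max_size=2):
    """Exhaustively score every SNP subset up to max_size and return the best one."""
    n_snps = len(genotypes[0])
    candidates = [
        subset
        for k in range(max_size + 1)
        for subset in combinations(range(n_snps), k)
    ]
    return max(candidates, key=lambda ps: bic_score(genotypes, phenotype, ps))

# Synthetic data: 3 binary SNPs, balanced design, 80 samples.
# The phenotype is the XOR of SNP 0 and SNP 1, so neither SNP has a marginal effect.
# It is noise-free, which is unrealistic but keeps the example deterministic.
genotypes = [list(g) for g in product((0, 1), repeat=3)] * 10
phenotype = [g[0] ^ g[1] for g in genotypes]

best = best_parent_set(genotypes, phenotype, max_size=3)
print(best, bic_score(genotypes, phenotype, best))
```

On this toy data, each SNP alone scores worse than the empty parent set, while the pair scores best. That is the situation the abstract describes as interactions with weak or no marginal effects. The complexity penalty also keeps the uninformative third SNP out of the selected set.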
Adverse drug reactions (ADRs) are a major public health concern, causing over 100,000 fatalities in the United States every year with an annual cost of $136 billion. Early detection and accurate prediction of ADRs are thus vital for drug development and patient safety. Multiple scientific disciplines, namely pharmacology, pharmacovigilance, and pharmacoinformatics, have been addressing the ADR problem from different perspectives. With the same goal of improving drug safety, this article summarizes and links the research efforts across these disciplines into a single framework, moving from a comprehensive understanding of the interactions between drugs and biological systems, to the identification of genetic and phenotypic predispositions of patients susceptible to higher ADR risks, and finally to the current state of implementation of medication-related decision support systems. We start by describing available computational resources for building drug-target interaction networks with biological annotations, which provide fundamental knowledge for ADR prediction. Databases are classified by function to help users select among them. Post-marketing surveillance is then introduced, where data-driven approaches can not only enhance the prediction accuracy of ADRs but also enable the discovery of genetic and phenotypic risk factors of ADRs. Understanding genetic risk factors for ADRs requires well-organized patient genetics information and analysis by pharmacogenomic approaches. Finally, the current state of clinical decision support systems is presented, along with how clinicians can be assisted by the integrated knowledge base to minimize the risk of ADRs. This review ends with a discussion of existing challenges in each of the disciplines, with potential solutions and future directions.
Recent technological advances allow for high-throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is leading us to the “big data” era. The availability of big data provides unprecedented opportunities but also raises new challenges for data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including “machine learning” algorithms, with “unsupervised” and “supervised” examples of each. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming-based solutions, we review webservers that allow users with limited or no programming background to perform these analyses on large data compendia.
High-throughput technologies used to interrogate transcriptomes have been generating a great amount of publicly available gene expression data. For rare diseases that lack clinical samples and research funding, there is a practical benefit to jointly analyzing existing datasets related to a specific rare disease. In this study, we collected a number of independently generated transcriptome data sets from four species: Human, Fly, Mouse and Worm. All data sets included samples with both normal and abnormal mitochondrial functions. We reprocessed each data set to standardize format, scale and gene annotation, and used the HomoloGene database to map genes between species. A standard procedure was also applied to compare gene expression profiles of normal and abnormal mitochondrial functions to identify differentially expressed genes. We further used meta-analysis and other integrative analyses to recognize patterns across data sets and species. Novel insights related to mitochondrial dysfunctions were revealed via these analyses, such as a group of genes consistently dysregulated by impaired mitochondrial function in multiple species. This study created a template for the study of rare diseases using genomic technologies and advanced statistical methods. All data and results generated by this study are freely available and stored at http://goo.gl/nOGWC2, to support further data mining.
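The cross-species workflow described above (map genes to homolog groups, then combine per-dataset evidence) can be sketched as follows. The abstract does not state which meta-analysis method was used. Fisher's combined probability test is used here as one common choice, and it assumes the datasets are independent. The gene identifiers, homolog-group labels, and p-values are made up for illustration and are not drawn from HomoloGene or from the study.

```python
import math
from collections import defaultdict

def fisher_combined_p(p_values):
    """Fisher's method: combine k independent p-values into a single p-value.

    X = -2 * sum(ln p) follows a chi-square distribution with 2k degrees of
    freedom under the null; for even degrees of freedom the survival function
    has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

def combine_across_species(results, homolog_group):
    """Group per-dataset p-values by homolog group and combine each group."""
    grouped = defaultdict(list)
    for species, gene, p in results:
        group = homolog_group.get((species, gene))
        if group is not None:  # genes without a mapped homolog are skipped
            grouped[group].append(p)
    return {group: fisher_combined_p(ps) for group, ps in grouped.items()}

# Hypothetical homolog mapping: (species, gene id) -> homolog group id.
homolog_group = {
    ("human", "GENE_H1"): "HG1",
    ("mouse", "gene_m1"): "HG1",
    ("fly", "gene_f1"): "HG1",
    ("human", "GENE_H2"): "HG2",
    ("worm", "gene_w2"): "HG2",
}

# Hypothetical differential-expression p-values from independent datasets.
results = [
    ("human", "GENE_H1", 0.01),
    ("mouse", "gene_m1", 0.02),
    ("fly", "gene_f1", 0.03),
    ("human", "GENE_H2", 0.4),
    ("worm", "gene_w2", 0.6),
    ("fly", "gene_unmapped", 0.001),
]

combined = combine_across_species(results, homolog_group)
for group, p in sorted(combined.items()):
    print(group, f"{p:.4g}")
```

A homolog group with modest but consistent evidence in several species (`HG1` here) ends up with a much smaller combined p-value than any single dataset provides. This is the kind of cross-species consistency the abstract reports.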