GPURFSCREEN: a GPU based virtual screening tool using random forest classifier

Jayaraj, P. B.; Ajay, Mathias K.; Nufail, M.; Gopakumar, G.; Jaleel, U. C. Abdul

doi:10.1186/s13321-016-0124-8

Cited by 22 publications

(14 citation statements)

References 21 publications

“…To address this challenge, several statistical measures that require less computational expense, such as the maximal information coefficient (MIC), have been developed to promote the efficient handling of big data. 124 In parallel, development of computational infrastructure that can be rapidly expanded, such as the Hadoop file system, 125 crowdsourcing, 126 and massively parallel processing hardware (including the recruitment of graphical processing units 127 ), are being actively explored and adopted.…”

Section: Big Data-driven Techniques For Drug Discovery (mentioning)

confidence: 99%

Using Big Data to Discover Diagnostics and Therapeutics for Gastrointestinal and Liver Diseases

et al. 2017

Technologies such as genome sequencing, gene expression profiling, proteomic and metabolomic analyses, electronic medical records, and patient-reported health information have produced large amounts of data, from various populations, cell types, and disorders (big data). However, these data must be integrated and analyzed if they are to produce models or concepts about physiologic function or mechanisms of pathogenesis. Many of these data are available to the public, allowing researchers anywhere to search for markers of specific biologic processes or therapeutic targets for specific diseases or patient types. We review recent advances in the fields of computational and systems biology, and highlight opportunities for researchers to use big data sets in the fields of gastroenterology and hepatology, to complement traditional means of diagnostic and therapeutic discovery.

“…For AID778, the Random Forest gave the AUC of 62 % with all data sets (about 95,000 compounds) [39] and we obtained the AUC of 71.28 % using the WL method with 4000 data sets. Firstly, we compared our results for AS with the previous results.…”

Section: Results (mentioning)

confidence: 80%

Graph Classification of Molecules Using Force Field Atom and Bond Types

Jippo, Matsuo, Kikuchi et al. 2019

Molecular Informatics

Classification of the biological activities of chemical substances is important for developing new medicines efficiently. Various machine learning methods are often employed to screen large libraries of compounds and predict the activities of new substances by learning molecular structure-activity relationships. One such method is graph classification, in which a molecular structure can be represented in terms of a labeled graph with nodes that correspond to atoms and edges that correspond to the bonds between these atoms. In a conventional graph definition, atomic symbols and bond orders are employed as node and edge labels, respectively. In this study, we developed new graph definitions using the assignment of atom and bond types in the force fields of molecular dynamics methods as node and edge labels, respectively. We found that these graph definitions improved the accuracies of activity classifications for chemical substances using graph kernels with support vector machines and deep neural networks. The higher accuracies obtained using our proposed definitions can enhance the development of materials informatics using graph-based machine learning methods.

“…To achieve negatives, the positive MMRSs and unlabeled MMRSs in each dataset were used as input to train classifier respectively. Eight classification algorithms (Kstar [37], BN [38], IBK [39], J48 [40], RF [41], SVM [42], AdaBoost [43], Bagging [44]) were adopted. These algorithms have been shown effective in various domains of bioinformatics and medicinal chemistry.…”

Section: Methods (mentioning)

confidence: 99%

Construction of Metabolism Prediction Models for CYP450 3A4, 2D6, and 2C9 Based on Microsomal Metabolic Reaction System

Zhang et al. 2016

IJMS

During the past decades, there have been continuous attempts to predict metabolism mediated by cytochrome P450s (CYP450s) 3A4, 2D6, and 2C9. However, accurately predicting the metabolism of xenobiotics mediated by these enzymes has remained a huge challenge. To address this issue, the microsomal metabolic reaction system (MMRS), a novel concept that integrates information about site of metabolism (SOM) and enzyme, was introduced. By incorporating the use of multiple feature selection (FS) techniques (ChiSquared (CHI), InfoGain (IG), GainRatio (GR), Relief) and hybrid classification procedures (Kstar, Bayes (BN), K-nearest neighbours (IBK), C4.5 decision tree (J48), RandomForest (RF), Support vector machines (SVM), AdaBoostM1, Bagging), metabolism prediction models were established based on metabolism data released by Sheridan et al. Four major biotransformations, including aliphatic C-hydroxylation, aromatic C-hydroxylation, N-dealkylation and O-dealkylation, were involved. For validation, the overall accuracies of all four biotransformations exceeded 0.95. For receiver operating characteristic (ROC) analysis, each of these models gave a significant area under curve (AUC) value >0.98. In addition, an external test was performed based on a previously published dataset. As a result, 87.7% of the potential SOMs were correctly identified by our four models. In summary, four MMRS-based models were established, which can be used to predict the metabolism mediated by CYP3A4, 2D6, and 2C9 with high accuracy.
