2004
DOI: 10.1021/ci049971e

Prediction of P-Glycoprotein Substrates by a Support Vector Machine Approach

Abstract: P-glycoproteins (P-gp) actively transport a wide variety of chemicals out of cells and function as drug efflux pumps that mediate multidrug resistance and limit the efficacy of many drugs. Methods that enable early elimination of potential P-gp substrates are useful for new drug discovery. A computational ensemble pharmacophore model has recently been used for the prediction of P-gp substrates with a promising accuracy of 63%. It is desirable to extend the prediction range beyond compounds co…

Cited by 158 publications (152 citation statements)
References 48 publications
“…This technique has already gained recognition as one of the most robust and efficient classifiers (21, 56-58, 63). It can tackle nontrivial problems by projecting the original descriptor vectors to a higher dimensional feature space where a clearer division between the two classes of data becomes feasible.…”
Section: Classification Procedures (mentioning)
confidence: 99%
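
The projection idea in the statement above can be illustrated with a minimal sketch. This is not the cited authors' method: the toy data and the explicit map `feature_map` (x to (x, x^2)) are assumptions chosen for illustration, whereas a real SVM would typically use a kernel function rather than an explicit map. The sketch only shows that classes which no single threshold separates in the original space can become separable along a coordinate of a higher-dimensional space.

```python
# Toy 1-D data (hypothetical): class +1 lies on both sides of class -1,
# so no single threshold on x separates the two classes.
points = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]


def separable_by_threshold(values, labels):
    """Return True if some threshold on `values` splits the classes perfectly."""
    pairs = sorted(zip(values, labels))
    sorted_labels = [lab for _, lab in pairs]
    # Perfect separation means the sorted labels change class at most once.
    changes = sum(1 for a, b in zip(sorted_labels, sorted_labels[1:]) if a != b)
    return changes <= 1


def feature_map(x):
    """Hypothetical explicit map phi(x) = (x, x^2) into a 2-D feature space."""
    return (x, x * x)


mapped = [feature_map(x) for x in points]
second_coordinate = [phi[1] for phi in mapped]

in_original_space = separable_by_threshold(points, labels)
in_feature_space = separable_by_threshold(second_coordinate, labels)

print("separable in original space:", in_original_space)
print("separable in feature space:", in_feature_space)
```

A threshold on the second coordinate corresponds to a straight line in the 2-D feature space, which is the kind of linear boundary an SVM looks for after the projection.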
“…A total of 261 compounds which were classified as P-gp substrates or nonsubstrates used in this work were cautiously assembled from literature [10,15,16,20]. After removing 12 compounds of overlap data, we constructed SD file format of 261 compounds using PreADMET S/W [21]. The data set consists of 146 substrates and 115 nonsubstrates.…”
Section: Methods (mentioning)
confidence: 99%
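
As a quick consistency check on the counts quoted above, the short sketch below adds the two class sizes and computes the accuracy a trivial majority-class predictor would reach on such a set. The baseline is an illustrative addition, not a figure reported in the quoted statement; the variable names are hypothetical.

```python
# Class counts as reported in the quoted statement.
n_substrates = 146
n_nonsubstrates = 115
n_total = n_substrates + n_nonsubstrates

# Accuracy of a trivial classifier that always predicts the majority class.
# This baseline is an illustrative assumption, not a result from the source.
majority_baseline = max(n_substrates, n_nonsubstrates) / n_total

print(f"total compounds: {n_total}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Any reported classification accuracy on this data set is easier to interpret when compared against that baseline.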
“…[11-17] However, the accurate prediction of P-gp remains a challenge due to the complexity of the understanding physiological mechanisms and the lack of high quality data. For increasing the predictability, many complex statistical machine learning methods were carried out using supervised methods such as artificial neural network (ANN), Bayesian network and support vector machine (SVM).…”
Section: Introduction (mentioning)
confidence: 99%
“…By using DRAGON Web version 3.0 [37], we derived a total of 1497 1D, 2D, and 3D molecular descriptors from the 3D structure of each compound. These descriptors can be divided into 18 classes including 47 constitutional descriptors, 70 geometrical descriptors, 266 topological descriptors, 150 RDF descriptors [38], 21 molecular walk counts [39], 160 3D-MoRSE descriptors [40], 64 BCUT descriptors [41], 99 WHIM descriptors [42], 21 Galvez topological charge indices [43], 197 GETAWAY descriptors [44], 96 2D autocorrelations, 121 functional groups, 14 charge descriptors, 120 atom-centered descriptors, 4 aromaticity indices [45], 3 empirical descriptors, 41 Randic molecular profiles [46], and 3 molecular properties. Moreover, an additional set of 105 electrotopological state descriptors [47] and 5 linear solvation energy relationship descriptors [48] were computed by using our own developed code.…”
Section: Datasets (mentioning)
confidence: 99%
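
The per-class counts in the statement above can be checked against the stated totals with a short sketch. The dictionary keys are shorthand labels introduced here for illustration; the numbers are taken directly from the quoted text.

```python
# Descriptor counts per class, as listed in the quoted statement.
# The key names are shorthand labels chosen for this sketch.
dragon_classes = {
    "constitutional": 47,
    "geometrical": 70,
    "topological": 266,
    "RDF": 150,
    "molecular walk counts": 21,
    "3D-MoRSE": 160,
    "BCUT": 64,
    "WHIM": 99,
    "Galvez topological charge indices": 21,
    "GETAWAY": 197,
    "2D autocorrelations": 96,
    "functional groups": 121,
    "charge": 14,
    "atom-centered": 120,
    "aromaticity indices": 4,
    "empirical": 3,
    "Randic molecular profiles": 41,
    "molecular properties": 3,
}

# Additional descriptors computed with the citing authors' own code.
extra_descriptors = {
    "electrotopological state": 105,
    "linear solvation energy relationship": 5,
}

n_classes = len(dragon_classes)
n_dragon = sum(dragon_classes.values())
n_all = n_dragon + sum(extra_descriptors.values())

print(f"{n_classes} classes, {n_dragon} DRAGON descriptors, {n_all} in total")
```

The class counts add up to the 1497 descriptors and 18 classes stated in the text; adding the 110 extra descriptors gives the combined size of the descriptor pool.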
“…[17-22] The aim of this work is to explore the use of support vector machine (SVM) methods for facilitating the prediction of substrates and nonsubstrates and inhibitors and noninhibitors of P450 isoenzymes. SVM has been successfully used in a wide range of problems including p-glycoprotein substrates [21], blood-brain barrier penetration [17,18], human intestinal absorption [20], torsade de pointes prediction [22], and protein function prediction [19]. The main advantage of SVM over other statistical learning methods is its relatively low sensitivity to data overfitting, even with the use of a large number of redundant and overlapping molecular descriptors.…”
Section: Introduction (mentioning)
confidence: 99%