2019
DOI: 10.1021/acs.jcim.9b00498

Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines

Abstract: The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa model…

Cited by 36 publications (30 citation statements)
References 72 publications
“…5,6,14 The LFER models apply the Hammett equations to predict pKa by classifying the molecule to a parent class and modifying the pKa value of the parent class with a property of its substituents. Machine learning models 11,12,15 usually use a molecular environment rooted at the ionization center as the descriptor to develop the pKa prediction approach by learning from data.…”
Section: Introduction (mentioning)
confidence: 99%
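For orientation, the LFER idea described in the statement above is commonly written in the following Hammett-type form. This is a generic textbook sketch, not a formula taken from the cited papers; the symbols (the parent-class value, the reaction constant, and the substituent constants) are assumptions introduced here for illustration.

```latex
% Hammett-type LFER sketch (symbols are illustrative assumptions):
%   pK_a^{0}   : pKa of the unsubstituted parent class
%   \rho       : reaction constant fitted for that parent class
%   \sigma_i   : substituent constant of the i-th substituent
\[
  \mathrm{p}K_a \;=\; \mathrm{p}K_a^{0} \;-\; \rho \sum_{i} \sigma_i
\]
```

Under this form, electron-withdrawing substituents (positive substituent constants) lower the predicted pKa relative to the parent class when the reaction constant is positive.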
“…In reaction prediction, ML algorithms have proven helpful in identifying the most likely types of reactions applicable to a given substrate under given reaction conditions, [1f, 4, 6e] and in the choice of site‐ or regioisomers that can form [7] . For relatively simple substrates and non‐stereoselective chemistries with sufficient numbers of literature precedents, the accuracy of these models has been satisfactory, reflecting the adequacy of molecular descriptors embodying information about atomic composition and connectivity (various 2D and 3D fingerprints, [8a–d] or descriptor libraries like DScribe [8e] ), electronic effects of substituents (e.g., Hammett constants [7a] or QM‐derived measures [9] ), as well as some measures of steric bulk in the vicinity of reaction center (e.g., TSEI indices we used to predict the outcomes of Diels–Alder reactions [7a] ). Simultaneously, there has been progress in developing predictors capturing stereochemical information [1f, 10] and in predicting outcomes of stereoselective reactions controlled by chiral catalysts (cf.…”
Section: Figure (mentioning)
confidence: 99%
“…[22–26] Applications of SVMs in chemistry include bioactivity prediction, toxicity-related properties and physicochemical property prediction. [1, 26–29] A dataset consisting of chemical structures or reactions must be converted to a machine-readable format before being presented to a machine learning algorithm. Molecular descriptors are based on the structural, physicochemical, electronic, or topological nature of molecules.…”
Section: Introduction (mentioning)
confidence: 99%
“…40 Fingerprints have also been utilised in kernel-based QSAR/QSPR relationship models, using the Tanimoto or RBF kernel. [27–29] Molecular graphs are another two-dimensional representation that depict the atoms and bonds within molecules as a set of nodes and edges. The global molecular structure is considered, in contrast to the local environments in fingerprints.…”
Section: Introduction (mentioning)
confidence: 99%
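To make the Tanimoto kernel mentioned in the statement above concrete, here is a minimal Python sketch. It assumes each fingerprint is represented as a set of on-bit indices; the function names and the toy bit sets are hypothetical and are not taken from any of the cited papers.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def tanimoto_kernel(fps_x, fps_y):
    """Pairwise Tanimoto similarities as a nested list (rows: fps_x, columns: fps_y)."""
    return [[tanimoto(x, y) for y in fps_y] for x in fps_x]


# Toy fingerprints (made-up bit indices, for illustration only).
fps = [{1, 4, 7, 9}, {1, 4, 8}, {2, 3}]
K = tanimoto_kernel(fps, fps)

for row in K:
    print(["%.2f" % value for value in row])
```

A matrix like `K` could, for example, be supplied to a kernel method that accepts precomputed kernels, such as scikit-learn's `SVR(kernel="precomputed")`; that pairing is a suggestion here, not something the quoted sources specify.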