ESOL: Estimating Aqueous Solubility Directly from Molecular Structure

Delaney, John S.

doi:10.1021/ci034243x

Cited by 750 publications

(601 citation statements)

References 24 publications

(53 reference statements)

Supporting

Mentioning

596

Contrasting

Unclassified


“…To empirically exemplify this point, using the inner approach in Lusci et al (2013) and the outer approach in Duvenaud et al (2015) on the benchmark solubility data set in Delaney (2004), we obtain almost identical RMSE (root mean square error) of 0.61 and 0.60 respectively, in line with the best results reported in the literature.…”

Section: Discussion (supporting)

confidence: 82%

The inner and outer approaches to the design of recursive neural architectures

Baldi

2017

Data Min Knowl Disc


Feedforward neural network architectures work well for numerical data of fixed size, such as images. For variable size, structured data, such as sequences, d dimensional grids, trees, and other graphs, recursive architectures must be used. We distinguish two general approaches for the design of recursive architectures in deep learning, the inner and the outer approach. The inner approach uses neural networks recursively inside the data graphs, essentially to "crawl" the edges of the graphs in order to compute the final output. It requires acyclic orientations of the underlying graphs. The outer approach uses neural networks recursively outside the data graphs and regardless of their orientation. These neural networks operate orthogonally to the data graph and progressively "fold" or aggregate the input structure to produce the final output. The distinction is illustrated using several examples from the fields of natural language processing, chemoinformatics, and bioinformatics, and applied to the problem of learning from variable-size sets.
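The citation statement for this entry compares models by RMSE (root mean square error) on the Delaney solubility benchmark. As a reminder of what that figure measures, here is a minimal sketch of the calculation. The function name and the sample values are hypothetical and chosen only for illustration; they are not taken from the benchmark or from either paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical log-solubility values, for illustration only.
observed = [-2.0, -3.5, -1.0, -4.0]
predicted = [-2.5, -3.0, -1.5, -3.5]

# Every prediction is off by exactly 0.5 log units, so the RMSE is 0.5.
print(rmse(observed, predicted))
```

Because solubility is modelled on a logarithmic scale, an RMSE of about 0.6 corresponds to a typical error of roughly a factor of four in the solubility itself.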


“…Recently, Delaney studied a much larger data set of 2874 compounds by using 9 simple descriptors that included calculated logP, molecular weight, aromatic proportion, non-carbon proportion, polar surface area, etc.¹³ The performance of the model was listed as follows: n = 2874, m = 9, R² = 0.69, UAE = 0.75, RMSE = 1.01. In another report, Votano and Parham constructed a set of models with topological structure indices as descriptors using a variety of data analysis methods.…”

Section: Introduction (mentioning)

confidence: 99%

Development of Reliable Aqueous Solubility Models and Their Application in Druglike Analysis

Wang, Krudy, Hou et al. 2007

J. Chem. Inf. Model.



In this work, two reliable aqueous solubility models, ASMS (aqueous solubility based on molecular surface) and ASMS-LOGP (aqueous solubility based on molecular surface using ClogP as a descriptor), were constructed by using atom type classified solvent accessible surface areas and several molecular descriptors for a diverse data set of 1708 molecules. For ASMS (without using ClogP as a descriptor), the leave-one-out q² and root-mean-square error (RMSE) were 0.872 and 0.748 log units, respectively. ASMS-LOGP was slightly better than ASMS (q² = 0.886, RMSE = 0.705). Both models were extensively validated by three cross-validation tests, and encouraging predictability was achieved. High throughput aqueous solubility prediction was conducted for a number of data sets extracted from several widely used databases. We found that real drugs are about 20-fold more soluble than the so-called druglike molecules in the ZINC database, which have no violation of Lipinski's "Rule of 5" at all. Specifically, oral drugs are about 16-fold more soluble, while injection drugs are 50-60-fold more soluble. If the criterion for a molecule to be soluble is set to -5 log units, about 85% of real drugs are predicted as soluble; in contrast, only 50% of druglike molecules in ZINC are soluble. We concluded that the two models could serve as a rule in druglike analysis and as an efficient filter in prioritizing compound libraries prior to high throughput screening (HTS).
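The abstract mixes two scales: fold differences in solubility (20-fold, 16-fold, 50-60-fold) and log units (the -5 cutoff, the RMSE values). The sketch below shows how the two relate, assuming base-10 logarithms. The function names are hypothetical, and treating the -5 cutoff as inclusive is an assumption, since the abstract does not say how the boundary is handled.

```python
import math


def fold_to_log_units(fold):
    """Convert a fold difference in solubility to a log10 difference."""
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return math.log10(fold)


def is_soluble(log_s, threshold=-5.0):
    """Apply a solubility cutoff on the log scale.

    Assumption: the cutoff is inclusive (log_s >= threshold).
    """
    return log_s >= threshold


# Fold differences quoted in the abstract, expressed in log units.
for fold in (16, 20, 50, 60):
    print(f"{fold}-fold = {fold_to_log_units(fold):.2f} log units")

# Hypothetical predicted log-solubility values, for illustration only.
for log_s in (-3.2, -5.0, -6.4):
    print(log_s, is_soluble(log_s))
```

On this scale a 20-fold difference is about 1.3 log units, which is larger than the RMSE reported for either model.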


“…For several decades, researchers have tried to predict solubility parameters by applying artificial neural networks (ANNs) [3][4][5][6], genetic algorithms (GAs) [6], multiple linear regressions [7], partial least squares (PLSs) [8,9], support vector machines (SVMs) [10,11], random forest (RF) models [12] and so on. However, there are not many previous works to directly compute solubility parameters from solvation free energy that is the fundamental physical variable determining the solvation process.…”

Section: Introduction (mentioning)

confidence: 99%

Solubility Prediction of Organic Ionic Compounds with Computational Methods for Photoresist Application

Ryu, Kim, Yoon et al. 2016

J. Photopol. Sci. Technol.


Predicting the solubility of organic ionic compounds in both aqueous and organic solvents is important for understanding and optimizing lithographic performance. In this study, we propose computational methods to predict the solubility of organic ionic compounds. To compare the predicted solubility with the experimental one, we applied a multiple linear regression model while varying the set of explanatory variables. We conclude that the solvation free energies of the cation-anion pair, the cation, and the anion are sufficient to describe the relationship between the predicted and experimental solubility values. A more accurate empirical model for quantitative solubility prediction of organic ionic compounds, obtained by expanding these regression models and further optimizing the parameters against a larger set of experimental values, is left for future work.
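The abstract describes a multiple linear regression on three solvation free energies. The sketch below shows what such a fit could look like using ordinary least squares solved through the normal equations. It is not the authors' implementation: the function names, descriptor values, and coefficients are all hypothetical, and the targets are generated from an exact linear relation so that the fit can be checked against known coefficients.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical descriptors per compound:
# (free energy of ion pair, of cation, of anion). Illustrative values only.
rows = [
    (-10.0, -50.0, -60.0),
    (-12.0, -55.0, -58.0),
    (-8.0, -48.0, -65.0),
    (-15.0, -60.0, -62.0),
    (-11.0, -52.0, -70.0),
    (-9.0, -58.0, -66.0),
]

# Hypothetical "true" coefficients used to generate noise-free targets.
true_coef = [0.5, 0.1, -0.02, 0.03]
targets = [
    true_coef[0] + sum(c * v for c, v in zip(true_coef[1:], r))
    for r in rows
]

coef = fit_mlr(rows, targets)
print([round(c, 6) for c in coef])
```

With real data the targets would be measured solubilities, the fit would not be exact, and comparing models built from different subsets of the three descriptors would correspond to the variable selection the abstract mentions.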

