2019
DOI: 10.1186/s13321-019-0381-4

Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies

Abstract: We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 k…

Cited by 28 publications (32 citation statements)
References 81 publications
“…These methods are computationally expensive, but the implementation of a machine learning model for prediction of hydrogen bond energies based on a large QM dataset was recently reported. [22] We envision that such ML models for hydrogen bonding strengths will make routine assessment of donor and acceptor sites very fast.…”
Section: Results (mentioning)
confidence: 99%
“…Compounds 1-24 were sourced via Aldrich Market Select using the following suppliers: Adooq Bioscience (California, USA) -boceprevir (13), dabigatran etexilate (3), daclatasvir (19), eliglustat (7), ledipasvir (23), tacrolimus (10), telaprevir (14); Aldrich-CPR (Sigma-Aldrich, Wisconsin, USA) -erythromycin (9); Ambeed Inc (Illinois, USA) -aliskiren (1), atazanavir (4), sofosbuvir (8), venetoclax (11); Carbosynth Ltd (Oxfordshire, UK) -pibrentasvir (21); Cayman Chemical -simeprevir (18), velpatasvir (24); MedChem Express (New Jersey, USA) -avanafil (2), cobicistat (5), elbasvir (22), glecaprevir (15), ombitasvir (20); Target Molecule Corp. (Massachusetts, USA) -asunaprevir (12), edoxaban (6), grazoprevir (16), paritaprevir (17). Stock solutions in DMSO with a concentration of 10 mM were prepared and used for all analyses.…”
Section: Experimental Section Materials (mentioning)
confidence: 99%
“…Compounds 1-24 were sourced via Aldrich Market Select using the following suppliers: Adooq Bioscience (California, USA) -boceprevir (13), dabigatran etexilate (3), daclatasvir (19), eliglustat (7), ledipasvir (23), tacrolimus (10), telaprevir (14); Aldrich-CPR (Sigma-Aldrich, Wisconsin, USA) -erythromycin (9); Ambeed Inc (Illinois, USA) -aliskiren (1), atazanavir (4), sofosbuvir (8), venetoclax (11); Carbosynth Ltd (Oxfordshire, UK) -pibrentasvir (21); Cayman Chemical -simeprevir (18), velpatasvir (24); MedChem Express (New Jersey, USA) -avanafil (2), cobicistat (5), elbasvir (22), glecaprevir (15), ombitasvir (20); Target Molecule Corp. (Massachusetts, USA) -asunaprevir (12), edoxaban (6), grazoprevir (16), paritaprevir (17). Stock solutions in DMSO with a concentration of 10 mM were prepared and used for all analyses.…”
Section: Methods (mentioning)
confidence: 99%
“…9 We investigate seven different atomic descriptors of intermediate complexity as input to the ML models (details on the descriptors are given in Table S2 in the supporting information). The atomic descriptors are developed by Finkelmann et al 20,21 and are chosen because they have been successfully applied to the prediction of site of metabolism, 21,22 hydrogen bond donor and acceptor strengths, 23,24 and Ames mutagenicity of primary aromatic amines. 25 Almost all of the descriptors depend on charge model 5 (CM5) atomic charges, 26 which are obtained from a single point calculation using GFN1-xTB as implemented in the open source semiempirical software package xtb version 6.4.0.…”
Section: Atomic Descriptors (mentioning)
confidence: 99%