2023
DOI: 10.1021/acs.jcim.3c00958

Explainable Graph Neural Networks with Data Augmentation for Predicting pKa of C–H Acids

Hongle An, Xuyang Liu, Wensheng Cai, et al.

Abstract: The pKa of C–H acids is an important parameter in the fields of organic synthesis, drug discovery, and materials science. However, the prediction of pKa is still a great challenge due to the limit of experimental data and the lack of chemical insight. Here, a new model for predicting the pKa values of C–H acids is proposed on the basis of graph neural networks (GNNs) and data augmentation. A message passing unit (MPU) was used to extract the topological and target-related information from the molecular grap…

Citations: cited by 4 publications (5 citation statements)
References: 59 publications
“…The AD of the AttenGpKa model is defined based on the similarity of molecules in the training set, as determined by Euclidean distance using Morgan fingerprints. Specific details can be found in our previous work. By calculating the distance matrix of compounds in the training set, the average Euclidean distance d̅ and the standard deviation σ of these distances are obtained, which are 6.44 and 2.95, respectively.…”
Section: Results and Discussion
mentioning
confidence: 99%
“…[31] Specific details can be found in our previous work [30]. By calculating the distance matrix of compounds in the training set, the average Euclidean distance d̅ and the standard deviation σ of these distances are obtained, which are 6.44 and 2.95, respectively. The distance threshold D_T for describing the AD is calculated using D_T = d̅ + Z·σ, where Z is an empirical parameter representing the significance level.…”
Section: Results and Discussion
mentioning
confidence: 99%
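
The threshold rule quoted above, D_T = d̅ + Z·σ, can be illustrated with a minimal sketch. This is not the citing authors' implementation. The function names, the toy two-dimensional vectors standing in for Morgan fingerprints, the use of unique training pairs with a population standard deviation, the nearest-neighbour membership test, and the value Z = 0.5 are all assumptions made for illustration; the quoted statement does not specify them.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_stats(train_fps):
    """Mean and standard deviation of pairwise training-set distances.

    Assumption: each unordered pair is counted once and the population
    standard deviation is used; the source does not state either choice.
    """
    dists = [euclidean(a, b) for a, b in combinations(train_fps, 2)]
    return mean(dists), pstdev(dists)


def threshold_from_stats(d_bar, sigma, z):
    """Distance threshold D_T = d_bar + Z * sigma, as quoted in the text."""
    return d_bar + z * sigma


def in_domain(query_fp, train_fps, threshold):
    """Hypothetical membership test: nearest training distance <= D_T."""
    nearest = min(euclidean(query_fp, fp) for fp in train_fps)
    return nearest <= threshold


# Toy vectors standing in for fingerprint vectors (illustrative only).
train = [(0, 0), (3, 4), (6, 8)]
d_bar, sigma = distance_stats(train)
d_t = threshold_from_stats(d_bar, sigma, z=1.0)

# Threshold from the statistics reported in the quote, with a hypothetical Z.
d_t_reported = threshold_from_stats(6.44, 2.95, z=0.5)
```

With the reported d̅ = 6.44 and σ = 2.95, the hypothetical Z = 0.5 gives D_T = 7.915; a larger Z widens the applicability domain and admits molecules that are less similar to the training set.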
“…Many papers on diverse directions are also gathered into this collection to describe the application of ML in cheminformatic studies. A series of articles in this collection focused on the prediction of the physicochemical characteristics of chemical compounds, and these characteristics included: temperature-dependent viscosity, solvation Gibbs energies, pKa, metal coordination geometry, binding energy, and electronic property. Another set of papers focused on molecular generation and design by introducing software/tool, developing transformer-based new algorithms, and optimizing molecule via molecular scaffold decoration. The remaining tested the performance of ChatGPT in chemical generation and similarity indexing…”
mentioning
confidence: 99%