2021
DOI: 10.1063/5.0035530

Improving molecular force fields across configurational space by combining supervised and unsupervised machine learning

Abstract: The training set of atomic configurations is key to the performance of any Machine Learning Force Field (MLFF) and, as such, the training set selection determines the applicability of the MLFF model for predictive molecular simulations. However, most atomistic reference datasets are inhomogeneously distributed across configurational space (CS), and thus, choosing the training set randomly or according to the probability distribution of the data leads to models whose accuracy is mainly defined by the most commo…

Citations: cited by 31 publications (42 citation statements)
References: 47 publications
“…These findings highlight the importance of performing a more homogeneous sampling of the configurational space to construct high-quality training sets. Indeed, as detailed in a recent publication, 107 the predictive performance of various ML potentials can be significantly boosted by applying unsupervised learning techniques to control the undersampling of physically relevant molecular structures that fall into low probability regions of the configurational space.…”
Section: Training Set Sampling and Conformational Analysis (mentioning)
confidence: 99%
“…The insufficient nature of mean error metrics has been pointed out before. [37][38][39] In addition to the above data sets, we also demonstrate the use of ACE on a slightly larger, significantly more flexible molecule that is more representative of the needs of medicinal chemistry applications.…”
Section: Introduction (mentioning)
confidence: 84%