2023
DOI: 10.1063/5.0139611
How to validate machine-learned interatomic potentials

Abstract: Machine learning (ML) approaches enable large-scale atomistic simulations with near-quantum-mechanical accuracy. With the growing availability of these methods there arises a need for careful validation, particularly for physically agnostic models, that is, for potentials which extract the nature of atomic interactions from reference data. Here, we review the basic principles behind ML potentials and their validation for atomic-scale materials modeling. We discuss best practice in defining error metrics based…

Cited by 53 publications (26 citation statements)
References 95 publications
“…In each case, mean absolute error (MAE) values are quoted as averages over a 5-fold cross-validation procedure, in which the structures from a single MD trajectory are dedicated entirely to either the training or the test set, to avoid training-data leakage. 58 When using less than the full training set, we take a random sample from all atomic environments without replacement. When training network-based models, which require a validation set, we further split the shuffled training set, holding out one tenth of the set or 1,000 points, whichever is lower.…”
Section: Learning Curves
confidence: 99%
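The splitting protocol quoted above (trajectory-level 5-fold cross-validation plus a capped validation hold-out) can be sketched as follows. This is a minimal illustration, assuming the data is organized as a list of MD trajectories, each a list of structures; the names `trajectory_kfold` and `split_validation` are illustrative, not from the cited work.

```python
import random

def trajectory_kfold(trajectories, k=5, seed=0):
    """Yield (train, test) structure lists for k-fold cross-validation.

    Every structure from a given MD trajectory lands entirely in either
    the training or the test side, avoiding leakage between near-identical
    frames of the same trajectory.
    """
    idx = list(range(len(trajectories)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for fold in folds:
        test = [s for i in fold for s in trajectories[i]]
        train = [s for i in idx if i not in fold for s in trajectories[i]]
        yield train, test

def split_validation(train, frac=0.1, cap=1000, seed=0):
    """Hold out min(10% of the training set, 1,000 points) as validation."""
    shuffled = train[:]
    random.Random(seed).shuffle(shuffled)
    n_val = min(int(len(shuffled) * frac), cap)
    return shuffled[n_val:], shuffled[:n_val]
```

For network-based potentials, `split_validation` would be applied to the training portion of each fold before fitting, mirroring the one-tenth-or-1,000-points rule described in the quote.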
“…An important general issue in ML methods is the determination of evaluation metrics for the quantification of successful training. The behavior of the loss function alone seems not to be a reliable criterion, as in some cases an equilibrated loss is not an indicator of physically meaningful results. Stocker et al., for example, report atomistic MD simulations of small organic molecules with GemNet potentials based on QM data.…”
Section: Open Challenges and Future Outlook
confidence: 99%
“…While purely experimental data-driven approaches for the prediction of properties of these materials have been proposed, 90 molecular simulations leveraging ML methods remain an unexplored area, 287 both at the atomistic and the CG level. 288 An important general issue in ML methods is the determination of evaluation metrics 256,289 for the quantification of successful training. The behavior of the loss function alone seems not to be a reliable criterion, as in some cases an equilibrated loss is not an indicator of physically meaningful results.…”
Section: Open Challenges and Future Outlook
confidence: 99%
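Both statements above stress that an equilibrated loss is not by itself evidence of a physically meaningful potential; standard practice is to report physically interpretable errors alongside it, such as per-atom energy MAE and force RMSE. A minimal sketch of such metrics (function names and units are illustrative assumptions, not from the cited works):

```python
import math

def energy_mae_per_atom(e_pred, e_ref, n_atoms):
    """MAE of total energies, normalized per atom (e.g. eV/atom).

    e_pred, e_ref: predicted and reference total energies per structure;
    n_atoms: number of atoms in each structure.
    """
    errors = [abs(p - r) / n for p, r, n in zip(e_pred, e_ref, n_atoms)]
    return sum(errors) / len(errors)

def force_rmse(f_pred, f_ref):
    """RMSE over all Cartesian force components, given as flat lists
    (e.g. eV/Å)."""
    sq_err = sum((p - r) ** 2 for p, r in zip(f_pred, f_ref))
    return math.sqrt(sq_err / len(f_ref))
```

Tracking these quantities on a held-out test set, rather than only the training loss, is one way to detect models whose loss has converged to values that do not correspond to physically meaningful predictions.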
“…Using these hand-curated systematic training datasets, we focused only on the DeepPot-SE approach and did not assess other MLP models. Such a comparative investigation of MLPs, similar to that of Zuo et al. 27 or Morrow et al., 53 would be helpful to the community in assessing the fidelity of the dataset and model combination. We hope this study, particularly the developed and shared datasets, will motivate such extensive comparative studies.…”
Section: Discussion
confidence: 98%