2024
DOI: 10.1088/2632-2153/ad1626

Synthetic pre-training for neural-network interatomic potentials

John L A Gardner,
Kathryn T Baker,
Volker L Deringer

Abstract: Machine learning (ML) based interatomic potentials have transformed the field of atomistic materials modelling. However, ML potentials depend critically on the quality and quantity of quantum-mechanical reference data with which they are trained, and therefore developing datasets and training pipelines is becoming an increasingly central challenge. Leveraging the idea of "synthetic" (artificial) data that is common in other areas of ML research, we here show that synthetic atomistic data, themselves obtained a…

Cited by 8 publications (2 citation statements)
References: 70 publications

“…The combination of active learning strategies using cheap UIP trajectories to generate synthetic data sets for pretraining MLIPs can also greatly increase the efficiency of generating new data sets and expanding current ones. 27,34 Making use of already existing data sets, although at different fidelity levels, can be explored in multifidelity or multimodality approaches, recently shown to be beneficial and accelerating fine-tuning more than 100 ×. 35 Additionally, there is still plenty of exploration of potential gains related to the architectural design of the models.…”
Section: Discussion (mentioning)
confidence: 99%
“…In terms of efficiency, fine-tuning UIPs can greatly accelerate the training by incorporating alchemical transferable knowledge from the large bulk data set (in this case, MPtrj), thus requiring only a modest data set to achieve sufficient accuracy in specialized tasks. The combination of active learning strategies using cheap UIP trajectories to generate synthetic data sets for pretraining MLIPs can also greatly increase the efficiency of generating new data sets and expanding current ones. 27,34 Making use of already existing data sets, although at different fidelity levels, can be explored in multifidelity or multimodality approaches, recently shown to be beneficial and accelerating fine-tuning more than 100 ×. 35 Additionally, there is still plenty of exploration of potential gains related to the architectural design of the models.…”
Section: Discussion (mentioning)
confidence: 99%
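
The citation statements above describe a two-stage workflow: pre-train a potential on a large, cheaply labelled synthetic dataset (e.g. from a fast universal potential), then fine-tune it on a small set of higher-fidelity reference labels. The sketch below is a minimal, purely illustrative toy example of that idea in PyTorch; it is not the authors' code or any specific MLIP library, and every function, model, and dataset name is hypothetical.

```python
# Purely illustrative sketch (hypothetical; not the authors' code or any
# specific MLIP library): pre-train a toy energy model on many cheaply
# labelled "synthetic" configurations, then fine-tune on a small set of
# higher-fidelity labels, mirroring the pre-train/fine-tune workflow above.
import torch
import torch.nn as nn


def make_dataset(n, label_fn):
    # Stand-in for atomistic descriptors (x) and total energies (y).
    x = torch.rand(n, 8)
    return x, label_fn(x)


# "Cheap" surrogate labels (playing the role of a fast universal potential)
# and "expensive" reference labels (playing the role of DFT-quality data).
def cheap_label(x):
    return x.sum(dim=1, keepdim=True)


def reference_label(x):
    return x.sum(dim=1, keepdim=True) + 0.1 * (x ** 2).sum(dim=1, keepdim=True)


model = nn.Sequential(nn.Linear(8, 64), nn.SiLU(), nn.Linear(64, 1))


def train(model, x, y, epochs, lr):
    # Simple full-batch regression loop on energies.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()


# 1) Pre-train on a large synthetic dataset labelled by the cheap surrogate.
x_syn, y_syn = make_dataset(10_000, cheap_label)
train(model, x_syn, y_syn, epochs=200, lr=1e-3)

# 2) Fine-tune on a small high-fidelity dataset, typically at a lower learning rate.
x_ref, y_ref = make_dataset(200, reference_label)
print(f"fine-tuned loss: {train(model, x_ref, y_ref, epochs=100, lr=1e-4):.4f}")
```

In a real setting the toy descriptors and label functions would be replaced by actual atomic structures labelled with a universal potential (for pre-training) and with quantum-mechanical calculations (for fine-tuning); the two-stage structure of the loop is the point being illustrated.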