Synthetic pre-training for neural-network interatomic potentials

Machine learning (ML) based interatomic potentials have transformed the field of atomistic materials modelling. However, ML potentials depend critically on the quality and quantity of quantum-mechanical reference data with which they are trained, and therefore developing datasets and training pipeli...


Bibliographic Details
Main Authors: John L A Gardner, Kathryn T Baker, Volker L Deringer
Format: Article
Language: English
Published: IOP Publishing 2024-01-01
Series: Machine Learning: Science and Technology
Subjects: machine learning; neural networks; synthetic data; atomistic simulations; molecular dynamics
Online Access: https://doi.org/10.1088/2632-2153/ad1626
author John L A Gardner
Kathryn T Baker
Volker L Deringer
author_sort John L A Gardner
collection DOAJ
description Machine learning (ML) based interatomic potentials have transformed the field of atomistic materials modelling. However, ML potentials depend critically on the quality and quantity of quantum-mechanical reference data with which they are trained, and therefore developing datasets and training pipelines is becoming an increasingly central challenge. Leveraging the idea of ‘synthetic’ (artificial) data that is common in other areas of ML research, we here show that synthetic atomistic data, themselves obtained at scale with an existing ML potential, constitute a useful pre-training task for neural-network (NN) interatomic potential models. Once pre-trained with a large synthetic dataset, these models can be fine-tuned on a much smaller, quantum-mechanical one, improving numerical accuracy and stability in computational practice. We demonstrate feasibility for a series of equivariant graph-NN potentials for carbon, and we carry out initial experiments to test the limits of the approach.
first_indexed 2024-03-08T15:27:16Z
format Article
id doaj.art-365cc2c702354ec79bf6d23c3fe4a19b
institution Directory of Open Access Journals
issn 2632-2153
language English
last_indexed 2024-03-08T15:27:16Z
publishDate 2024-01-01
publisher IOP Publishing
record_format Article
series Machine Learning: Science and Technology
spelling doaj.art-365cc2c702354ec79bf6d23c3fe4a19b
2024-01-10T08:28:58Z
eng
IOP Publishing
Machine Learning: Science and Technology
2632-2153
2024-01-01
Volume 5, issue 1, article 015003
10.1088/2632-2153/ad1626
Synthetic pre-training for neural-network interatomic potentials
John L A Gardner (https://orcid.org/0009-0006-7377-7146)
Kathryn T Baker
Volker L Deringer (https://orcid.org/0000-0001-6873-0278)
Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom (all three authors)
https://doi.org/10.1088/2632-2153/ad1626
machine learning; neural networks; synthetic data; atomistic simulations; molecular dynamics
title Synthetic pre-training for neural-network interatomic potentials
topic machine learning
neural networks
synthetic data
atomistic simulations
molecular dynamics
url https://doi.org/10.1088/2632-2153/ad1626