Calibration of uncertainty in the active learning of machine learning force fields

FFLUX is a machine learning force field that uses the maximum expected prediction error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian process (GP) to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a GP is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the GP with a Student-t process. We investigate the impact of these methods on MEPE for single-sample and batch-sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t process can outperform active learning strategies and random sampling using a GP if the training set is sufficiently large.


Bibliographic Details
Main Authors: Adam Thomas-Mitchell, Glenn Hawe, Paul L A Popelier
Format: Article
Language: English
Published: IOP Publishing 2023-01-01
Series: Machine Learning: Science and Technology
Subjects:
Online Access: https://doi.org/10.1088/2632-2153/ad0ab5
_version_ 1797505156438294528
author Adam Thomas-Mitchell
Glenn Hawe
Paul L A Popelier
author_facet Adam Thomas-Mitchell
Glenn Hawe
Paul L A Popelier
author_sort Adam Thomas-Mitchell
collection DOAJ
description FFLUX is a machine learning force field that uses the maximum expected prediction error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian process (GP) to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a GP is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the GP with a Student-t process. We investigate the impact of these methods on MEPE for single-sample and batch-sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t process can outperform active learning strategies and random sampling using a GP if the training set is sufficiently large.
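The abstract describes selecting the next training sample using the predictive uncertainty of a GP. As an illustration only, the following is a minimal numpy-only sketch of the exploration component of such an acquisition step: fit a GP posterior and pick the pool configuration with the largest posterior variance. The RBF kernel, its hyperparameters, and the function names are assumptions for this sketch; the actual MEPE score used in FFLUX also weighs an exploitation term and is not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel k(a, b) = s^2 exp(-|a - b|^2 / (2 l^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-d2 / (2.0 * lengthscale ** 2))

def gp_posterior(X_train, y_train, X_query, noise=1e-6):
    # Standard GP regression posterior: mean and marginal variance at X_query.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_query)          # cross-covariance, (n_train, n_query)
    Kss = rbf_kernel(X_query, X_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - (v ** 2).sum(0)
    return mean, np.maximum(var, 0.0)          # clip tiny negative round-off

def select_next(X_train, y_train, X_pool):
    # Pure-exploration acquisition: the pool point the GP is least certain about.
    _, var = gp_posterior(X_train, y_train, X_pool)
    return int(np.argmax(var))
```

In a toy 1-D setting, a pool point far from all training data keeps close to the prior variance and is selected, while a point between training samples has its variance sharply reduced; this is the exploration pressure that, per the abstract, an uncalibrated GP may misjudge immediately after training.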
first_indexed 2024-03-10T04:14:35Z
format Article
id doaj.art-f47c0606007a4148ba0f1eed6f686799
institution Directory Open Access Journal
issn 2632-2153
language English
last_indexed 2024-03-10T04:14:35Z
publishDate 2023-01-01
publisher IOP Publishing
record_format Article
series Machine Learning: Science and Technology
spelling doaj.art-f47c0606007a4148ba0f1eed6f686799 (2023-11-23T08:05:00Z)
eng; IOP Publishing; Machine Learning: Science and Technology; ISSN 2632-2153; 2023-01-01; vol. 4, no. 4, 045034; doi:10.1088/2632-2153/ad0ab5
Calibration of uncertainty in the active learning of machine learning force fields
Adam Thomas-Mitchell (https://orcid.org/0009-0001-7977-5313), School of Computing, Ulster University, 2-24 York Street, BT15 1AP Belfast, United Kingdom
Glenn Hawe (https://orcid.org/0000-0002-0590-8494), School of Computing, Ulster University, 2-24 York Street, BT15 1AP Belfast, United Kingdom
Paul L A Popelier (https://orcid.org/0000-0001-9053-1363), Department of Chemistry, The University of Manchester, Oxford Road, M13 9PL Manchester, United Kingdom
FFLUX is a machine learning force field that uses the maximum expected prediction error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian process (GP) to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a GP is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the GP with a Student-t process. We investigate the impact of these methods on MEPE for single-sample and batch-sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t process can outperform active learning strategies and random sampling using a GP if the training set is sufficiently large.
https://doi.org/10.1088/2632-2153/ad0ab5
machine learning force fields; Gaussian process; calibration; active learning; uncertainty quantification
spellingShingle Adam Thomas-Mitchell
Glenn Hawe
Paul L A Popelier
Calibration of uncertainty in the active learning of machine learning force fields
Machine Learning: Science and Technology
machine learning force fields
Gaussian process
calibration
active learning
uncertainty quantification
title Calibration of uncertainty in the active learning of machine learning force fields
title_full Calibration of uncertainty in the active learning of machine learning force fields
title_fullStr Calibration of uncertainty in the active learning of machine learning force fields
title_full_unstemmed Calibration of uncertainty in the active learning of machine learning force fields
title_short Calibration of uncertainty in the active learning of machine learning force fields
title_sort calibration of uncertainty in the active learning of machine learning force fields
topic machine learning force fields
Gaussian process
calibration
active learning
uncertainty quantification
url https://doi.org/10.1088/2632-2153/ad0ab5
work_keys_str_mv AT adamthomasmitchell calibrationofuncertaintyintheactivelearningofmachinelearningforcefields
AT glennhawe calibrationofuncertaintyintheactivelearningofmachinelearningforcefields
AT paullapopelier calibrationofuncertaintyintheactivelearningofmachinelearningforcefields