Calibration of uncertainty in the active learning of machine learning force fields
FFLUX is a machine learning force field that uses the maximum expected prediction error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian process (GP) to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a GP is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the GP with a Student-t process. We investigate the impact of these methods on MEPE for single sample and batch sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t process can outperform active learning strategies and random sampling that use a GP if the training set is sufficiently large.
Main Authors: | Adam Thomas-Mitchell, Glenn Hawe, Paul L A Popelier |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2023-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | machine learning force fields; Gaussian process; calibration; active learning; uncertainty quantification |
Online Access: | https://doi.org/10.1088/2632-2153/ad0ab5 |
_version_ | 1797505156438294528 |
---|---|
author | Adam Thomas-Mitchell; Glenn Hawe; Paul L A Popelier |
author_facet | Adam Thomas-Mitchell; Glenn Hawe; Paul L A Popelier |
author_sort | Adam Thomas-Mitchell |
collection | DOAJ |
description | FFLUX is a machine learning force field that uses the maximum expected prediction error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian process (GP) to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a GP is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the GP with a Student-t process. We investigate the impact of these methods on MEPE for single sample and batch sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t process can outperform active learning strategies and random sampling that use a GP if the training set is sufficiently large. |
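The abstract's core mechanism, using a GP's predictive variance to choose the next training sample, can be sketched in a few lines. The following is a minimal illustrative toy, not the authors' FFLUX/MEPE implementation: the hand-rolled RBF-kernel GP, the sin(x) target, and the pure largest-variance selection rule (MEPE additionally weights by an expected prediction error term) are all assumptions made for the example.

```python
import numpy as np

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two sets of 1-D points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Standard GP regression: posterior mean and variance at query points.
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    Kss = rbf(x_query, x_query)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = np.diag(Kss - Ks.T @ v)
    return mean, np.maximum(var, 0.0)

# Toy data: learn sin(x) from three samples, then pick as the next
# training point the candidate with the largest predictive variance
# (pure exploration; a MEPE-style rule would also weight exploitation).
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
candidates = np.linspace(0.0, 6.0, 61)
mean, var = gp_posterior(x_train, y_train, candidates)
next_x = candidates[np.argmax(var)]
print(f"next sample: {next_x:.2f}")
```

Because the RBF kernel's predictive variance grows with distance from the training set, the point selected here is the candidate farthest from all training data; the paper's concern is precisely that these raw GP variances can be poorly calibrated early in training, motivating CRUDE-style recalibration or a Student-t process.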
first_indexed | 2024-03-10T04:14:35Z |
format | Article |
id | doaj.art-f47c0606007a4148ba0f1eed6f686799 |
institution | Directory Open Access Journal |
issn | 2632-2153 |
language | English |
last_indexed | 2024-03-10T04:14:35Z |
publishDate | 2023-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj.art-f47c0606007a4148ba0f1eed6f686799 | 2023-11-23T08:05:00Z | eng | IOP Publishing | Machine Learning: Science and Technology | 2632-2153 | 2023-01-01 | vol. 4, no. 4, art. 045034 | 10.1088/2632-2153/ad0ab5 | Calibration of uncertainty in the active learning of machine learning force fields | Adam Thomas-Mitchell (https://orcid.org/0009-0001-7977-5313), School of Computing, Ulster University, 2-24 York Street, BT15 1AP Belfast, United Kingdom | Glenn Hawe (https://orcid.org/0000-0002-0590-8494), School of Computing, Ulster University, 2-24 York Street, BT15 1AP Belfast, United Kingdom | Paul L A Popelier (https://orcid.org/0000-0001-9053-1363), Department of Chemistry, The University of Manchester, Oxford Road, M13 9PL Manchester, United Kingdom | https://doi.org/10.1088/2632-2153/ad0ab5 | machine learning force fields; Gaussian process; calibration; active learning; uncertainty quantification |
spellingShingle | Adam Thomas-Mitchell Glenn Hawe Paul L A Popelier Calibration of uncertainty in the active learning of machine learning force fields Machine Learning: Science and Technology machine learning force fields Gaussian process calibration active learning uncertainty quantification |
title | Calibration of uncertainty in the active learning of machine learning force fields |
title_full | Calibration of uncertainty in the active learning of machine learning force fields |
title_fullStr | Calibration of uncertainty in the active learning of machine learning force fields |
title_full_unstemmed | Calibration of uncertainty in the active learning of machine learning force fields |
title_short | Calibration of uncertainty in the active learning of machine learning force fields |
title_sort | calibration of uncertainty in the active learning of machine learning force fields |
topic | machine learning force fields; Gaussian process; calibration; active learning; uncertainty quantification |
url | https://doi.org/10.1088/2632-2153/ad0ab5 |
work_keys_str_mv | AT adamthomasmitchell calibrationofuncertaintyintheactivelearningofmachinelearningforcefields AT glennhawe calibrationofuncertaintyintheactivelearningofmachinelearningforcefields AT paullapopelier calibrationofuncertaintyintheactivelearningofmachinelearningforcefields |