Comparison of the Validity and Generalizability of Machine Learning Algorithms for the Prediction of Energy Expenditure: Validation Study

Background: Accurate solutions for the estimation of physical activity and energy expenditure at scale are needed for a range of medical and health research fields. Machine learning techniques show promise in research-grade accelerometers, and some evidence indicates that these...


Bibliographic Details
Main Authors:	Ruairi O'Driscoll, Jake Turicchi, Mark Hopkins, Cristiana Duarte, Graham W Horgan, Graham Finlayson, R James Stubbs
Format:	Article
Language:	English
Published:	JMIR Publications 2021-08-01
Series:	JMIR mHealth and uHealth
Online Access:	https://mhealth.jmir.org/2021/8/e23938

_version_	1827859164784754688
author	Ruairi O'Driscoll, Jake Turicchi, Mark Hopkins, Cristiana Duarte, Graham W Horgan, Graham Finlayson, R James Stubbs
author_sort	Ruairi O'Driscoll
collection	DOAJ
description	Background: Accurate solutions for the estimation of physical activity and energy expenditure at scale are needed for a range of medical and health research fields. Machine learning techniques show promise in research-grade accelerometers, and some evidence indicates that these techniques can be applied to more scalable commercial devices. Objective: This study aims to test the validity and out-of-sample generalizability of algorithms for the prediction of energy expenditure in several wearables (ie, Fitbit Charge 2, ActiGraph GT3-x, SenseWear Armband Mini, and Polar H7) using two laboratory data sets comprising different activities. Methods: Two laboratory studies (study 1: n=59, age 44.4 years, weight 75.7 kg; study 2: n=30, age 31.9 years, weight 70.6 kg), in which adult participants performed a sequential lab-based activity protocol consisting of resting, household, ambulatory, and nonambulatory tasks, were combined in this study. In both studies, accelerometer and physiological data were collected from the wearables alongside energy expenditure using indirect calorimetry. Three regression algorithms were used to predict metabolic equivalents (METs; ie, random forest, gradient boosting, and neural networks), and five classification algorithms (ie, k-nearest neighbor, support vector machine, random forest, gradient boosting, and neural networks) were used for physical activity intensity classification as sedentary, light, or moderate to vigorous. Algorithms were evaluated using leave-one-subject-out cross-validations and out-of-sample validations. Results: The root mean square error (RMSE) was lowest for gradient boosting applied to SenseWear and Polar H7 data (0.91 METs), and in the classification task, gradient boost applied to SenseWear and Polar H7 was the most accurate (85.5%). Fitbit models achieved an RMSE of 1.36 METs and 78.2% accuracy for classification. Errors tended to increase in out-of-sample validations with the SenseWear neural network achieving RMSE values of 1.22 METs in the regression tasks and the SenseWear gradient boost and random forest achieving an accuracy of 80% in classification tasks. Conclusions: Algorithms trained on combined data sets demonstrated high predictive accuracy, with a tendency for superior performance of random forests and gradient boosting for most but not all wearable devices. Predictions were poorer in the between-study validations, which creates uncertainty regarding the generalizability of the tested algorithms.
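The leave-one-subject-out evaluation and RMSE metric named in the description can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the function names (fit_line, loso_rmse) are hypothetical, a one-feature least-squares line stands in for the study's random forest, gradient boosting, and neural network models, and errors are pooled across held-out subjects, which the record does not specify. It is not the authors' pipeline.

```python
import math
from collections import defaultdict


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in model, not the study's)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def loso_rmse(records):
    """Leave-one-subject-out RMSE.

    records: list of (subject_id, feature, mets) tuples.
    Each subject is held out once; the model is fit on all other subjects
    and scored on the held-out subject. Squared errors are pooled.
    """
    by_subject = defaultdict(list)
    for sid, x, y in records:
        by_subject[sid].append((x, y))

    sq_err, count = 0.0, 0
    for held_out in by_subject:
        train = [
            (x, y)
            for sid, rows in by_subject.items()
            if sid != held_out
            for x, y in rows
        ]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        for x, y in by_subject[held_out]:
            sq_err += (a + b * x - y) ** 2
            count += 1
    return math.sqrt(sq_err / count)


# Synthetic, made-up data: subject s3 is offset by +1 MET from the others.
records = [
    ("s1", 0.0, 1.0), ("s1", 1.0, 3.0),
    ("s2", 0.0, 1.0), ("s2", 1.0, 3.0),
    ("s3", 0.0, 2.0), ("s3", 1.0, 4.0),
]
print(round(loso_rmse(records), 4))
```

Splitting by subject rather than by row keeps a participant's data out of both training and test folds at once, so the score reflects performance on unseen people. The out-of-sample validation in the description goes one step further by holding out an entire study.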
first_indexed	2024-03-12T13:04:22Z
format	Article
id	doaj.art-69ca61dce1b24bb390a19b57bd88bd6d
institution	Directory Open Access Journal
issn	2291-5222
language	English
last_indexed	2024-03-12T13:04:22Z
publishDate	2021-08-01
publisher	JMIR Publications
record_format	Article
series	JMIR mHealth and uHealth
spelling	doaj.art-69ca61dce1b24bb390a19b57bd88bd6d; 2023-08-28T18:27:53Z; eng; JMIR Publications; JMIR mHealth and uHealth; 2291-5222; 2021-08-01; 9; 8; e23938; 10.2196/23938; Comparison of the Validity and Generalizability of Machine Learning Algorithms for the Prediction of Energy Expenditure: Validation Study; Ruairi O'Driscoll (https://orcid.org/0000-0003-3995-0073); Jake Turicchi (https://orcid.org/0000-0003-1174-813X); Mark Hopkins (https://orcid.org/0000-0002-7655-0215); Cristiana Duarte (https://orcid.org/0000-0002-6566-273X); Graham W Horgan (https://orcid.org/0000-0002-6048-1374); Graham Finlayson (https://orcid.org/0000-0002-5620-2256); R James Stubbs (https://orcid.org/0000-0002-0843-9064); https://mhealth.jmir.org/2021/8/e23938
title	Comparison of the Validity and Generalizability of Machine Learning Algorithms for the Prediction of Energy Expenditure: Validation Study
url	https://mhealth.jmir.org/2021/8/e23938


Similar Items