Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening

As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been...


Bibliographic details
Main Authors:	Yang, J, Soltan, AAS, Clifton, DA
Format:	Journal article
Language:	English
Published:	Springer Nature 2022

_version_	1826311683659792384
author	Yang, J Soltan, AAS Clifton, DA
collection	OXFORD
description	As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
first_indexed	2024-03-07T08:13:22Z
format	Journal article
id	oxford-uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8
institution	University of Oxford
language	English
last_indexed	2024-03-07T08:13:22Z
publishDate	2022
publisher	Springer Nature
record_format	dspace
spelling	oxford-uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8 | 2023-12-05T10:54:24Z | Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening | Journal article | http://purl.org/coar/resource_type/c_dcae04bc | uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8 | English | Symplectic Elements | Springer Nature | 2022 | Yang, J | Soltan, AAS | Clifton, DA
title	Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening
