Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening

As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been...

Bibliographic details
Main Authors: Yang, J, Soltan, AAS, Clifton, DA
Format: Journal article
Language: English
Published: Springer Nature 2022
author Yang, J
Soltan, AAS
Clifton, DA
collection OXFORD
description As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) finetuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
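Illustrative note: the second method in the description, readjusting the decision threshold with site-specific data, might look roughly like the sketch below. This is not the authors' implementation. The selection criterion (a target sensitivity), the function names `pick_threshold` and `npv`, the 0.5 "as-is" threshold, and the toy scores and labels are all assumptions made for illustration; the abstract does not state how the threshold is chosen.

```python
from typing import Sequence


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   target_sensitivity: float = 0.9) -> float:
    """Return the highest threshold whose sensitivity on site data meets the target.

    Assumption: sensitivity is the selection criterion (hypothetical choice).
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("site data contains no positive cases")
    # Candidate thresholds are the observed scores, tried from highest to lowest.
    for t in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        if true_pos / positives >= target_sensitivity:
            return t
    # Fallback; the lowest observed score always gives sensitivity 1.0,
    # so the loop normally returns before reaching this line.
    return min(scores)


def npv(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Negative predictive value: TN / (TN + FN) among cases predicted negative."""
    predicted_negative = [y for s, y in zip(scores, labels) if s < threshold]
    if not predicted_negative:
        return float("nan")
    return predicted_negative.count(0) / len(predicted_negative)


# Toy site-specific data (made up): scores from a ready-made model, true labels.
site_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
site_labels = [0, 0, 0, 1, 0, 1, 1, 1]

as_is_threshold = 0.5  # hypothetical threshold shipped with the ready-made model
site_threshold = pick_threshold(site_scores, site_labels, target_sensitivity=0.9)

print("NPV with as-is threshold:", npv(site_scores, site_labels, as_is_threshold))
print("Site-specific threshold:", site_threshold)
print("NPV with site-specific threshold:", npv(site_scores, site_labels, site_threshold))
```

The sketch only changes the cut-off applied to the model's output scores; the model itself is untouched, which is what distinguishes this approach from finetuning via transfer learning.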
first_indexed 2024-03-07T08:13:22Z
format Journal article
id oxford-uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8
institution University of Oxford
language English
last_indexed 2024-03-07T08:13:22Z
publishDate 2022
publisher Springer Nature
record_format dspace