Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening
As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been...
Main Authors: | Yang, J; Soltan, AAS; Clifton, DA |
---|---|
Format: | Journal article |
Language: | English |
Published: | Springer Nature, 2022 |
_version_ | 1826311683659792384 |
---|---|
author | Yang, J Soltan, AAS Clifton, DA |
collection | OXFORD |
description | As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model "as-is"; (2) readjusting the decision threshold on the model's output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches. |
first_indexed | 2024-03-07T08:13:22Z |
format | Journal article |
id | oxford-uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T08:13:22Z |
publishDate | 2022 |
publisher | Springer Nature |
record_format | dspace |
spelling | oxford-uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8; 2023-12-05T10:54:24Z; Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:eb1b2d14-e699-4455-beb2-325d9fa19ca8; English; Symplectic Elements; Springer Nature; 2022; Yang, J; Soltan, AAS; Clifton, DA |
title | Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening |
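
The second method named in the description, readjusting the decision threshold on a ready-made model's output using site-specific data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy scores and labels, the as-is threshold of 0.5, and the choice of a target sensitivity as the selection criterion. The sketch compares NPV at the as-is threshold with NPV at a threshold recalibrated on local data.

```python
from typing import Sequence, Tuple


def confusion_counts(scores: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives that the model flags."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def npv(tn: int, fn: int) -> float:
    """Negative predictive value: fraction of negative calls that are correct."""
    return tn / (tn + fn) if (tn + fn) else 0.0


def readjust_threshold(scores: Sequence[float], labels: Sequence[int],
                       target_sensitivity: float) -> float:
    """Pick the highest observed score that, used as a threshold, reaches
    the target sensitivity on the site-specific data (hypothetical criterion)."""
    for candidate in sorted(set(scores), reverse=True):
        tp, _, _, fn = confusion_counts(scores, labels, candidate)
        if sensitivity(tp, fn) >= target_sensitivity:
            return candidate
    return min(scores)  # fallback: flag every case as positive


# Toy site-specific data (invented): model output scores and true labels.
site_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
site_labels = [0, 0, 1, 0, 1, 0, 1, 1]

# Method 1 ("as-is"): keep the threshold shipped with the ready-made model.
_, _, tn_asis, fn_asis = confusion_counts(site_scores, site_labels, 0.5)
print("as-is NPV:", npv(tn_asis, fn_asis))

# Method 2: recalibrate the threshold on local data, then re-evaluate.
new_threshold = readjust_threshold(site_scores, site_labels, 1.0)
_, _, tn_new, fn_new = confusion_counts(site_scores, site_labels, new_threshold)
print("readjusted threshold:", new_threshold, "NPV:", npv(tn_new, fn_new))
```

In practice the threshold would be chosen on a held-out calibration split from the new site and evaluated on separate test data; the sketch reuses one toy set only to keep the example short.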