Reliability of crowdsourced data and patient-reported outcome measures in cough-based COVID-19 screening
Abstract Mass community testing is a critical means for monitoring the spread of the COVID-19 pandemic. Polymerase chain reaction (PCR) is the gold standard for detecting the causative coronavirus 2 (SARS-CoV-2) but the test is invasive, test centers may not be readily available, and the wait for la...
Main Authors: | Hao Xiong, Shlomo Berkovsky, Mohamed Ali Kâafar, Adam Jaffe, Enrico Coiera, Roneel V. Sharan |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2022-12-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-022-26492-5 |
_version_ | 1797977486974255104 |
---|---|
author | Hao Xiong Shlomo Berkovsky Mohamed Ali Kâafar Adam Jaffe Enrico Coiera Roneel V. Sharan |
author_sort | Hao Xiong |
collection | DOAJ |
description | Abstract Mass community testing is a critical means for monitoring the spread of the COVID-19 pandemic. Polymerase chain reaction (PCR) is the gold standard for detecting the causative coronavirus 2 (SARS-CoV-2), but the test is invasive, test centers may not be readily available, and the wait for laboratory results can take several days. Various machine-learning-based alternatives to PCR screening for SARS-CoV-2 have been proposed, including cough sound analysis. Cough classification models appear to be a robust means to predict infective status, but collecting reliable PCR-confirmed data for their development is challenging, and recent work using unverified crowdsourced data is seen as a viable alternative. In this study, we report experiments that assess cough classification models trained (i) using data from PCR-confirmed COVID subjects and (ii) using data of individuals self-reporting their infective status. We compare performance using PCR-confirmed data. Models trained on PCR-confirmed data perform better than those trained on patient-reported data. Models using PCR-confirmed data also exploit more stable predictive features and converge faster. Crowdsourced cough data are less reliable than PCR-confirmed data for developing predictive models for COVID-19, which raises concerns about the utility of patient-reported outcome data in developing other clinical predictive models when better gold-standard data are available. |
first_indexed | 2024-04-11T05:07:41Z |
format | Article |
id | doaj.art-9b8d731cd5eb460cbd5a479c8c82631a |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-11T05:07:41Z |
publishDate | 2022-12-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-9b8d731cd5eb460cbd5a479c8c82631a; 2022-12-25T12:16:24Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2022-12-01; 12119; 10.1038/s41598-022-26492-5; Reliability of crowdsourced data and patient-reported outcome measures in cough-based COVID-19 screening; Hao Xiong (0); Shlomo Berkovsky (1); Mohamed Ali Kâafar (2); Adam Jaffe (3); Enrico Coiera (4); Roneel V. Sharan (5); (0) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; (1) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; (2) Department of Computing, Macquarie University; (3) School of Women’s and Children’s Health, Faculty of Medicine, University of New South Wales; (4) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; (5) Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University; https://doi.org/10.1038/s41598-022-26492-5 |
title | Reliability of crowdsourced data and patient-reported outcome measures in cough-based COVID-19 screening |
url | https://doi.org/10.1038/s41598-022-26492-5 |