Comparing different versions of computer-aided detection products when reading chest X-rays for tuberculosis

Computer-aided detection (CAD) was recently recommended by the WHO for TB screening and triage based on several evaluations, but unlike traditional diagnostic tests, software versions are updated frequently and require constant evaluation. Since then, newer versions of two of the evaluated products...


Bibliographic Details
Main Authors: Zhi Zhen Qin, Rachael Barrett, Shahriar Ahmed, Mohammad Shahnewaz Sarker, Kishor Paul, Ahammad Shafiq Sikder Adel, Sayera Banu, Jacob Creswell
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-06-01
Series: PLOS Digital Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9931298/?tool=EBI
collection DOAJ
description Computer-aided detection (CAD) was recently recommended by the WHO for TB screening and triage based on several evaluations, but unlike traditional diagnostic tests, CAD software is updated frequently and requires constant evaluation. Since the recommendation, newer versions of two of the evaluated products have already been released. We used a case-control sample of 12,890 chest X-rays to compare performance and model the programmatic effect of upgrading to newer versions of CAD4TB and qXR. We compared the area under the receiver operating characteristic curve (AUC) overall and with data stratified by age, TB history, gender, and patient source. All versions were compared against radiologist readings and WHO’s Target Product Profile (TPP) for a TB triage test. Both newer versions significantly outperformed their predecessors in terms of AUC: CAD4TB version 6 (0.823 [0.816–0.830]) versus version 7 (0.903 [0.897–0.908]), and qXR version 2 (0.872 [0.866–0.878]) versus version 3 (0.906 [0.901–0.911]). Newer versions met WHO TPP values; older versions did not. All products equalled or surpassed human radiologist performance, with improvements in triage ability in newer versions. Humans and CAD performed worse in older age groups and among those with TB history. New versions of CAD outperform their predecessors. Prior to implementation, CAD should be evaluated using local data because underlying neural networks can differ significantly. An independent rapid evaluation centre is needed to provide implementers with performance data on new versions of CAD products as they are developed.
Author summary: The World Health Organization recommended the use of artificial intelligence (AI)-powered computer-aided detection (CAD) for TB screening and triage in 2021. One year on, we comprehensively compare the performance of the newest versions of two CAD products (CAD4TB and qXR) to their WHO-evaluated predecessors. We found that both newer versions significantly improved upon their predecessors’ ability to detect TB, performing better than the human readers. We also showed that the AI underlying new software versions can differ remarkably from the old and resemble an entirely new product. We further demonstrate that, unlike updates to laboratory diagnostic tools, CAD software updates could significantly impact the selection of appropriate threshold scores, the number of people with TB detected, and cost-effectiveness. With newer CAD versions being rolled out almost annually, our results underscore the need for rapid evidence generation to evaluate newer CAD versions in the fast-growing medical AI industry.
first_indexed 2024-03-12T03:51:21Z
id doaj.art-2fb9bae21b444bb88739ac1edc1e67b5
institution Directory Open Access Journal
issn 2767-3170
last_indexed 2024-03-12T03:51:21Z