Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

<h4>Background</h4>Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity versus area under the Receiver Operating Characteristic curve (ROC AUC), for the evaluation o...


Bibliographic Details
Main Authors: Susan Mallett, Steve Halligan, Gary S Collins, Doug G Altman
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0107633
collection DOAJ
description <h4>Background</h4>Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity versus area under the Receiver Operating Characteristic curve (ROC AUC), for the evaluation of CT colonography for the detection of polyps, either with or without computer-assisted detection. <h4>Methods</h4>In a multireader multicase study of 10 readers and 107 cases, we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. <h4>Results</h4>Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores: in assigning scores to all cases; in the use of zero scores when no polyps were identified; in the bimodal, non-normal distribution of scores; in fitting ROC curves, due to extrapolation beyond the study data; and in the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. <h4>Conclusions</h4>The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
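The contrast the abstract draws between the two measures can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the ten cases, the 0 to 100 confidence scale, the rule that a case counts as reported positive when its score is above zero, and the function names. None of it is taken from the study data or the authors' analysis, and the AUC shown is the simple empirical (Mann-Whitney) estimate, not a fitted ROC curve.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reported: Sequence[bool], truth: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity and specificity from binary reports and a reference standard."""
    true_pos = sum(r and t for r, t in zip(reported, truth))
    true_neg = sum((not r) and (not t) for r, t in zip(reported, truth))
    positives = sum(truth)
    negatives = len(truth) - positives
    return true_pos / positives, true_neg / negatives


def empirical_auc(scores: Sequence[float], truth: Sequence[bool]) -> float:
    """Empirical ROC AUC: the share of (positive, negative) case pairs in which
    the positive case has the higher score, counting ties as one half."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical reader: 4 cases with polyps, 6 without (reference standard).
truth = [True, True, True, True, False, False, False, False, False, False]
# Hypothetical confidence scores; 0 means the reader identified no polyp.
# Note the single confident false positive (95) in the last case.
scores = [90, 80, 0, 0, 0, 0, 0, 0, 0, 95]
# Assumed reporting rule: any non-zero score is a positive report.
reported = [s > 0 for s in scores]

sens, spec = sensitivity_specificity(reported, truth)
auc = empirical_auc(scores, truth)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

In this toy data most scores are zero, so most case pairs are ties, and the one high-scoring false positive outranks every true positive. That is a small-scale version of two issues the abstract lists: zero scores when no polyps were identified, and the undue influence of a few false positive results.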
first_indexed 2024-12-21T07:14:07Z
id doaj.art-4b1b236ffd494459b205d7368d15310b
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-12-21T07:14:07Z
spelling doaj.art-4b1b236ffd494459b205d7368d15310b; 2022-12-21T19:11:55Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; 9; 10; e107633; 10.1371/journal.pone.0107633