Reliability of radiologists’ first impression when interpreting a screening mammogram

Previous studies showed that radiologists can detect the gist of an abnormality in a mammogram based on a half-second image presentation through global processing of screening mammograms. This study investigated the intra- and inter-observer reliability of the radiologists’ initial impressions about...

Bibliographic Details
Main Authors: Ziba Gandomkar, Somphone Siviengphanom, Mo’ayyad Suleiman, Dennis Wong, Warren Reed, Ernest U. Ekpo, Dong Xu, Sarah J. Lewis, Karla K. Evans, Jeremy M. Wolfe, Patrick C. Brennan
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10128970/?tool=EBI
_version_ 1797837895581564928
author Ziba Gandomkar
Somphone Siviengphanom
Mo’ayyad Suleiman
Dennis Wong
Warren Reed
Ernest U. Ekpo
Dong Xu
Sarah J. Lewis
Karla K. Evans
Jeremy M. Wolfe
Patrick C. Brennan
author_facet Ziba Gandomkar
Somphone Siviengphanom
Mo’ayyad Suleiman
Dennis Wong
Warren Reed
Ernest U. Ekpo
Dong Xu
Sarah J. Lewis
Karla K. Evans
Jeremy M. Wolfe
Patrick C. Brennan
author_sort Ziba Gandomkar
collection DOAJ
description Previous studies showed that radiologists can detect the gist of an abnormality in a mammogram based on a half-second image presentation through global processing of screening mammograms. This study investigated the intra- and inter-observer reliability of the radiologists’ initial impressions about the abnormality (or "gist signal"). It also examined if a subset of radiologists produced more reliable and accurate gist signals. Thirty-nine radiologists provided their initial impressions on two separate occasions, viewing each mammogram for half a second each time. The intra-class correlation (ICC) values showed poor to moderate intra-reader reliability. Only 13 radiologists had an ICC of 0.6 or above, which is considered the minimum standard for reliability, and only three radiologists had an ICC exceeding 0.7. The median value for the weighted Cohen’s Kappa was 0.478 (interquartile range = 0.419–0.555). The Mann-Whitney U-test showed that the "Gist Experts", defined as those who outperformed others, had significantly higher ICC values (p = 0.002) and weighted Cohen’s Kappa scores (p = 0.026). However, even for these experts, the intra-radiologist agreements were not strong, as an ICC of at least 0.75 indicates good reliability and the signal from none of the readers reached this level of reliability as determined by ICC values. The inter-reader reliability of the gist signal was poor, with an ICC score of 0.31 (CI = 0.26–0.37). The Fleiss Kappa score of 0.106 (CI = 0.105–0.106), indicating only slight inter-reader agreement, confirms the findings from the ICC analysis. The intra- and inter-reader reliability analysis showed that the radiologists’ initial impressions are not reliable signals. In particular, the absence of an abnormal gist does not reliably signal a normal case, so radiologists should keep searching. This highlights the importance of "discovery scanning," or coarse screening to detect potential targets before ending the visual search.
first_indexed 2024-04-09T15:32:02Z
format Article
id doaj.art-364c6eed0c9d484f931f4304ec40aead
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-09T15:32:02Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-364c6eed0c9d484f931f4304ec40aead; 2023-04-28T05:31:55Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2023-01-01; 18; 4; Reliability of radiologists’ first impression when interpreting a screening mammogram; Ziba Gandomkar; Somphone Siviengphanom; Mo’ayyad Suleiman; Dennis Wong; Warren Reed; Ernest U. Ekpo; Dong Xu; Sarah J. Lewis; Karla K. Evans; Jeremy M. Wolfe; Patrick C. Brennan; https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10128970/?tool=EBI
spellingShingle Ziba Gandomkar
Somphone Siviengphanom
Mo’ayyad Suleiman
Dennis Wong
Warren Reed
Ernest U. Ekpo
Dong Xu
Sarah J. Lewis
Karla K. Evans
Jeremy M. Wolfe
Patrick C. Brennan
Reliability of radiologists’ first impression when interpreting a screening mammogram
PLoS ONE
title Reliability of radiologists’ first impression when interpreting a screening mammogram
title_full Reliability of radiologists’ first impression when interpreting a screening mammogram
title_fullStr Reliability of radiologists’ first impression when interpreting a screening mammogram
title_full_unstemmed Reliability of radiologists’ first impression when interpreting a screening mammogram
title_short Reliability of radiologists’ first impression when interpreting a screening mammogram
title_sort reliability of radiologists first impression when interpreting a screening mammogram
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10128970/?tool=EBI
work_keys_str_mv AT zibagandomkar reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT somphonesiviengphanom reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT moayyadsuleiman reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT denniswong reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT warrenreed reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT ernestuekpo reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT dongxu reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT sarahjlewis reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT karlakevans reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT jeremymwolfe reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
AT patrickcbrennan reliabilityofradiologistsfirstimpressionwheninterpretingascreeningmammogram
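
The description above reports intra-reader agreement as a weighted Cohen's Kappa between a radiologist's two viewing sessions. The following is a minimal sketch of how such a statistic can be computed, not the authors' analysis code. The record does not state the rating scale or the weighting scheme, so the 5-point scale, the linear weights, and the names `weighted_kappa`, `session_one`, and `session_two` are all assumptions made for illustration.

```python
from collections import Counter


def weighted_kappa(first, second, n_categories, power=1):
    """Weighted Cohen's kappa for two lists of ordinal ratings.

    Ratings are integers in 0..n_categories-1. power=1 gives linear
    disagreement weights and power=2 gives quadratic weights; which
    scheme the study used is not stated in this record (assumption).
    """
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(first)
    joint = Counter(zip(first, second))  # counts of (session 1, session 2) pairs
    row = Counter(first)                 # marginal counts, session 1
    col = Counter(second)                # marginal counts, session 2
    max_dist = (n_categories - 1) ** power
    observed = 0.0  # weighted disagreement actually observed
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) ** power / max_dist  # disagreement weight in [0, 1]
            observed += w * joint[(i, j)] / n
            expected += w * (row[i] / n) * (col[j] / n)
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed / expected


# Hypothetical ratings from one reader on eight cases, two sessions.
session_one = [0, 1, 2, 4, 3, 1, 0, 2]
session_two = [0, 2, 2, 3, 3, 0, 1, 2]
print(round(weighted_kappa(session_one, session_two, n_categories=5), 3))
```

A value of 1 means the two sessions agree on every case, and a value near 0 means agreement is no better than chance. Under these assumptions, a per-reader value would be computed once for each radiologist and then summarized, for example by the median and interquartile range.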