An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography
An international reader study was conducted to gauge the average diagnostic accuracy of radiologists interpreting chest X-ray images, including those from fluorography and mammography, and to establish requirements for stand-alone radiological artificial intelligence (AI) models. The retrospective studi...
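Main Authors: | Kirill Arzamasov, Yuriy Vasilev, Anton Vladzymyrskyy, Olga Omelyanskaya, Igor Shulkin, Darya Kozikhina, Inna Goncharova, Pavel Gelezhe, Yury Kirpichev, Tatiana Bobrovskaya, Anna Andreychenko |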
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | Healthcare |
Subjects: | stand-alone artificial intelligence; radiology; benchmarking; population screening |
Online Access: | https://www.mdpi.com/2227-9032/11/12/1684 |
_version_ | 1797594607618359296 |
---|---|
author | Kirill Arzamasov; Yuriy Vasilev; Anton Vladzymyrskyy; Olga Omelyanskaya; Igor Shulkin; Darya Kozikhina; Inna Goncharova; Pavel Gelezhe; Yury Kirpichev; Tatiana Bobrovskaya; Anna Andreychenko |
author_facet | Kirill Arzamasov; Yuriy Vasilev; Anton Vladzymyrskyy; Olga Omelyanskaya; Igor Shulkin; Darya Kozikhina; Inna Goncharova; Pavel Gelezhe; Yury Kirpichev; Tatiana Bobrovskaya; Anna Andreychenko |
author_sort | Kirill Arzamasov |
collection | DOAJ |
description | An international reader study was conducted to gauge the average diagnostic accuracy of radiologists interpreting chest X-ray images, including those from fluorography and mammography, and to establish requirements for stand-alone radiological artificial intelligence (AI) models. The retrospective studies in the datasets were labelled as containing or not containing target pathological findings based on the consensus of two experienced radiologists and, where applicable, the results of a laboratory test and follow-up examination. A total of 204 radiologists from 11 countries, with varying levels of experience, assessed the dataset on a 5-point Likert scale via a web platform. Eight commercial radiological AI models analyzed the same dataset. The AUROC was 0.87 (95% CI 0.83–0.9) for AI versus 0.96 (95% CI 0.94–0.97) for radiologists. The sensitivity of AI versus radiologists was 0.71 (95% CI 0.64–0.78) versus 0.91 (95% CI 0.86–0.95), and the specificity was 0.93 (95% CI 0.89–0.96) versus 0.9 (95% CI 0.85–0.94). The overall diagnostic accuracy of radiologists was superior to that of AI for chest X-ray and mammography. However, the accuracy of AI was noninferior to that of the least experienced radiologists for mammography and fluorography, and to that of all radiologists for chest X-ray. Therefore, an AI-based first reading could be recommended to reduce the workload of radiologists for the most common radiological studies, such as chest X-ray and mammography. |
first_indexed | 2024-03-11T02:25:41Z |
format | Article |
id | doaj.art-b779faf62d0d481a8bee5662813367ad |
institution | Directory of Open Access Journals |
issn | 2227-9032 |
language | English |
last_indexed | 2024-03-11T02:25:41Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Healthcare |
spelling | doaj.art-b779faf62d0d481a8bee5662813367ad; 2023-11-18T10:37:42Z; eng; MDPI AG; Healthcare; ISSN 2227-9032; 2023-06-01; volume 11, issue 12, article 1684; DOI 10.3390/healthcare11121684; An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography; Kirill Arzamasov, Yuriy Vasilev, Anton Vladzymyrskyy, Olga Omelyanskaya, Igor Shulkin, Darya Kozikhina, Inna Goncharova, Pavel Gelezhe, Yury Kirpichev, Tatiana Bobrovskaya, Anna Andreychenko (all authors: State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia); https://www.mdpi.com/2227-9032/11/12/1684; stand-alone artificial intelligence; radiology; benchmarking; population screening |
title | An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography |
title_full | An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography |
title_fullStr | An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography |
title_full_unstemmed | An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography |
title_short | An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography |
title_sort | international non inferiority study for the benchmarking of ai for routine radiology cases chest x ray fluorography and mammography |
topic | stand-alone artificial intelligence; radiology; benchmarking; population screening |
url | https://www.mdpi.com/2227-9032/11/12/1684 |