An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography

An international reader study was conducted to gauge the average diagnostic accuracy of radiologists interpreting chest X-ray images, including those from fluorography and mammography, and to establish requirements for stand-alone radiological artificial intelligence (AI) models. The retrospective studi...

Bibliographic Details
Main Authors: Kirill Arzamasov, Yuriy Vasilev, Anton Vladzymyrskyy, Olga Omelyanskaya, Igor Shulkin, Darya Kozikhina, Inna Goncharova, Pavel Gelezhe, Yury Kirpichev, Tatiana Bobrovskaya, Anna Andreychenko
Format: Article
Language: English
Published: MDPI AG 2023-06-01
Series: Healthcare
Subjects: stand-alone artificial intelligence; radiology; benchmarking; population screening
Online Access: https://www.mdpi.com/2227-9032/11/12/1684
collection DOAJ
description An international reader study was conducted to gauge the average diagnostic accuracy of radiologists interpreting chest X-ray images, including those from fluorography and mammography, and to establish requirements for stand-alone radiological artificial intelligence (AI) models. The retrospective studies in the datasets were labelled as containing or not containing target pathological findings based on the consensus of two experienced radiologists and, where applicable, the results of a laboratory test and follow-up examination. A total of 204 radiologists from 11 countries, with varying levels of experience, assessed the dataset on a 5-point Likert scale via a web platform. Eight commercial radiological AI models analyzed the same dataset. The AUROC was 0.87 (95% CI 0.83–0.9) for AI versus 0.96 (95% CI 0.94–0.97) for radiologists. For AI versus radiologists, sensitivity was 0.71 (95% CI 0.64–0.78) versus 0.91 (95% CI 0.86–0.95), and specificity was 0.93 (95% CI 0.89–0.96) versus 0.9 (95% CI 0.85–0.94). The overall diagnostic accuracy of radiologists was superior to that of AI for chest X-ray and mammography. However, the accuracy of AI was noninferior to that of the least experienced radiologists for mammography and fluorography, and to that of all radiologists for chest X-ray. Therefore, an AI-based first reading could be recommended to reduce the workload of radiologists for the most common radiological studies, such as chest X-ray and mammography.
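The description reports AUROC, sensitivity and specificity derived from 5-point Likert ratings. The following is a minimal sketch of how such metrics can be computed, not the authors' analysis code. The ratings, the ground-truth labels and the positivity threshold of 3 are all hypothetical, and the function names `auroc` and `sensitivity_specificity` are introduced here for illustration only.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Treat a case as positive when its score is >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: Likert ratings (1 = definitely normal, 5 = definitely
# pathological) and ground truth (1 = target finding present).
likert = [5, 4, 2, 3, 1, 2, 4, 1]
truth = [1, 1, 1, 0, 0, 0, 1, 0]

area = auroc(likert, truth)
sens, spec = sensitivity_specificity(likert, truth, threshold=3)  # assumed cut-off
print(area, sens, spec)
```

AUROC uses the whole rating scale, whereas sensitivity and specificity depend on the chosen cut-off, which is why the two kinds of figures are reported separately.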
first_indexed 2024-03-11T02:25:41Z
format Article
id doaj.art-b779faf62d0d481a8bee5662813367ad
institution Directory Open Access Journal
issn 2227-9032
language English
last_indexed 2024-03-11T02:25:41Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Healthcare
spelling doaj.art-b779faf62d0d481a8bee5662813367ad; 2023-11-18T10:37:42Z; eng; MDPI AG; Healthcare; ISSN 2227-9032; 2023-06-01; volume 11, issue 12, article 1684; DOI 10.3390/healthcare11121684
Affiliation (all eleven authors): State Budget-Funded Health Care Institution of the City of Moscow “Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department”, Petrovka Street, 24, Building 1, 127051 Moscow, Russia
title An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography
topic stand-alone artificial intelligence
radiology
benchmarking
population screening
url https://www.mdpi.com/2227-9032/11/12/1684