Are deep models in radiomics performing better than generic models? A systematic review

Abstract. Background: Application of radiomics proceeds by extracting and analysing imaging features based on generic morphological, textural, and statistical features defined by formulas. Recently, deep learning methods were applied. It is unclear whether deep models (DMs) can outperform generic mode...


Bibliographic Details
Main Author: Aydin Demircioğlu
Format: Article
Language: English
Published: SpringerOpen 2023-03-01
Series: European Radiology Experimental
Subjects: Artificial intelligence; Deep learning; Machine learning; Radiology; Radiomics
Online Access: https://doi.org/10.1186/s41747-023-00325-0
author Aydin Demircioğlu
collection DOAJ
description Abstract. Background: Application of radiomics proceeds by extracting and analysing imaging features based on generic morphological, textural, and statistical features defined by formulas. Recently, deep learning methods have also been applied. It is unclear whether deep models (DMs) can outperform generic models (GMs). Methods: We identified publications on PubMed and Embase to determine differences between DMs and GMs in terms of the area under the receiver operating characteristic curve (AUC). Results: Of 1,229 records (between 2017 and 2021), 69 studies were included: 61 (88%) on tumours, 68 (99%) retrospective, and 39 (56%) single centre; 30 (43%) used an internal validation cohort, and 18 (26%) applied cross-validation. Studies with an independent internal cohort had a median training sample of 196 (range 41–1,455); those with cross-validation had only 133 (43–1,426). Median size of validation cohorts was 73 (18–535) for internal and 94 (18–388) for external. On internal validation, DMs performed better than GMs in 74% (49/66) and worse in 20% (13/66), with no difference in 6% (4/66); the median difference in AUC was 0.045. On external validation, DMs were better in 65% (13/20) and GMs in 20% (4/20), with no difference in 15% (3/20); the median difference in AUC was 0.025. On internal validation, fused models outperformed GMs and DMs in 72% (20/28), while they were worse in 14% (4/28) and equal in 14% (4/28); the median gain in AUC was +0.02. On external validation, fused models performed better in 63% (5/8), worse in 25% (2/8), and equal in 13% (1/8); the median gain in AUC was +0.025. Conclusions: Overall, DMs outperformed GMs, but in 26% of the studies DMs did not outperform GMs.
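The 26% figure in the Conclusions appears consistent with the internal-validation counts reported in the Results (13 comparisons where GMs were better plus 4 ties, out of 66). The abstract does not state how the figure was derived, so that link is an assumption. A minimal Python sketch of the arithmetic, with hypothetical helper names and counts copied from the abstract:

```python
from fractions import Fraction

# Counts copied from the Results: (DM better, GM better, no difference).
INTERNAL = (49, 13, 4)  # 66 comparisons on internal validation
EXTERNAL = (13, 4, 3)   # 20 comparisons on external validation


def share_not_better(counts):
    """Fraction of comparisons in which the DM did not beat the GM."""
    dm_better, gm_better, tied = counts
    total = dm_better + gm_better + tied
    return Fraction(gm_better + tied, total)


def as_percent(frac):
    """Whole-number percentage, rounding halves up (assumed convention)."""
    return int(frac * 100 + Fraction(1, 2))


print(as_percent(share_not_better(INTERNAL)))  # 26
print(as_percent(share_not_better(EXTERNAL)))  # 35
```

Under the same assumption, the corresponding share on external validation would be 35% (7/20), a figure the abstract does not report directly.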
first_indexed 2024-04-09T23:11:24Z
format Article
id doaj.art-f377a14cd0ab486bacbc7ac118b22413
institution Directory Open Access Journal
issn 2509-9280
language English
last_indexed 2024-04-09T23:11:24Z
publishDate 2023-03-01
publisher SpringerOpen
record_format Article
series European Radiology Experimental
spelling doaj.art-f377a14cd0ab486bacbc7ac118b22413; 2023-03-22T10:22:49Z; eng; SpringerOpen; European Radiology Experimental; 2509-9280; 2023-03-01; 71114; 10.1186/s41747-023-00325-0; Are deep models in radiomics performing better than generic models? A systematic review; Aydin Demircioğlu (Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen); https://doi.org/10.1186/s41747-023-00325-0; Artificial intelligence; Deep learning; Machine learning; Radiology; Radiomics
title Are deep models in radiomics performing better than generic models? A systematic review
topic Artificial intelligence
Deep learning
Machine learning
Radiology
Radiomics
url https://doi.org/10.1186/s41747-023-00325-0