Are deep models in radiomics performing better than generic models? A systematic review
Main Author: | Aydin Demircioğlu
---|---
Format: | Article
Language: | English
Published: | SpringerOpen, 2023-03-01
Series: | European Radiology Experimental
Subjects: | Artificial intelligence; Deep learning; Machine learning; Radiology; Radiomics
Online Access: | https://doi.org/10.1186/s41747-023-00325-0
author | Aydin Demircioğlu (Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen)
collection | DOAJ |
description | Abstract Background The application of radiomics proceeds by extracting and analysing imaging features based on generic morphological, textural, and statistical features defined by explicit formulas. More recently, deep learning methods have also been applied. It is unclear whether deep models (DMs) can outperform generic models (GMs). Methods We identified publications on PubMed and Embase to determine differences between DMs and GMs in terms of the area under the receiver operating characteristic curve (AUC). Results Of 1,229 records (published between 2017 and 2021), 69 studies were included: 61 (88%) on tumours, 68 (99%) retrospective, and 39 (56%) single-centre; 30 (43%) used an independent internal validation cohort, and 18 (26%) applied cross-validation. Studies with an independent internal cohort had a median training sample of 196 (range 41–1,455); those using cross-validation had a median of only 133 (43–1,426). The median size of the validation cohorts was 73 (18–535) for internal and 94 (18–388) for external cohorts. On internal validation, DMs performed better than GMs in 74% (49/66) of comparisons, worse in 20% (13/66), and equally in 6% (4/66); the median difference in AUC was 0.045. On external validation, DMs were better in 65% (13/20), GMs in 20% (4/20), with no difference in 15% (3/20); the median difference in AUC was 0.025. On internal validation, fused models outperformed both GMs and DMs in 72% (20/28) of comparisons, were worse in 14% (4/28), and equal in 14% (4/28); the median gain in AUC was +0.02. On external validation, fused models performed better in 63% (5/8), worse in 25% (2/8), and equal in 13% (1/8); the median gain in AUC was +0.025. Conclusions Overall, DMs outperformed GMs, but in 26% of the studies, DMs did not outperform GMs.
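The win/loss percentages and median AUC differences reported in the abstract are simple aggregates over per-study AUC pairs. As a minimal illustration (not from the paper; the study names and AUC values below are hypothetical), such aggregates can be computed as follows:

```python
# Minimal sketch: aggregating per-study AUC pairs into the kind of
# summary statistics reported in the review. Values are hypothetical.
from statistics import median

# (deep-model AUC, generic-model AUC) per study -- hypothetical data
studies = [(0.85, 0.80), (0.78, 0.81), (0.90, 0.90), (0.88, 0.84)]

# per-study AUC differences (DM minus GM)
diffs = [dm - gm for dm, gm in studies]
dm_better = sum(d > 0 for d in diffs)
gm_better = sum(d < 0 for d in diffs)
no_diff = sum(d == 0 for d in diffs)

n = len(studies)
print(f"DM better: {dm_better}/{n} ({100 * dm_better / n:.0f}%)")
print(f"GM better: {gm_better}/{n} ({100 * gm_better / n:.0f}%)")
print(f"No difference: {no_diff}/{n} ({100 * no_diff / n:.0f}%)")
print(f"Median AUC difference (DM - GM): {median(diffs):+.3f}")
```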
format | Article |
id | doaj.art-f377a14cd0ab486bacbc7ac118b22413 |
institution | Directory Open Access Journal |
issn | 2509-9280 |
language | English |
publishDate | 2023-03-01 |
publisher | SpringerOpen |
record_format | Article |
series | European Radiology Experimental |
title | Are deep models in radiomics performing better than generic models? A systematic review |
topic | Artificial intelligence; Deep learning; Machine learning; Radiology; Radiomics
url | https://doi.org/10.1186/s41747-023-00325-0 |