Comparing the Robustness of ResNet, Swin-Transformer, and MLP-Mixer under Unique Distribution Shifts in Fundus Images


Bibliographic Details
Main Authors: Kazuaki Ishihara, Koutarou Matsumoto
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Bioengineering
Online Access: https://www.mdpi.com/2306-5354/10/12/1383
Description
Summary: Background: Diabetic retinopathy (DR) is the leading cause of visual impairment and blindness. Consequently, numerous deep learning models have been developed for the early detection of DR. Safety-critical applications employed in medical diagnosis must be robust to distribution shifts. Previous studies have focused on model performance under distribution shifts using natural image datasets such as ImageNet, CIFAR-10, and SVHN. However, there is a lack of research specifically investigating the performance using medical image datasets. To address this gap, we investigated trends under distribution shifts using fundus image datasets. Methods: We used the EyePACS dataset for DR diagnosis, introduced noise specific to fundus images, and evaluated the performance of ResNet, Swin-Transformer, and MLP-Mixer models under a distribution shift. The discriminative ability was evaluated using the Area Under the Receiver Operating Characteristic curve (ROC-AUC), while the calibration ability was evaluated using the monotonic sweep calibration error (ECE sweep). Results: Swin-Transformer exhibited a higher ROC-AUC than ResNet under all types of noise and displayed a smaller reduction in the ROC-AUC due to noise. ECE sweep did not show a consistent trend across different model architectures. Conclusions: Swin-Transformer consistently demonstrated superior discrimination compared to ResNet. This trend persisted even under unique distribution shifts in the fundus images.
ISSN:2306-5354
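
Illustrative note on the two metrics named in the summary: the record does not give the authors' implementation, so the following is only a minimal sketch under stated assumptions. It assumes a binary setting (label 1 for the positive class, with a predicted probability per image), computes ROC-AUC through the rank-based Mann-Whitney formulation, and computes a monotonic sweep calibration error using equal-mass bins, taking the largest bin count for which the per-bin label means remain non-decreasing. The function names `roc_auc`, `binned_ece`, and `ece_sweep` are hypothetical, and the equal-mass binning choice is an assumption rather than something confirmed by the abstract.

```python
import numpy as np


def roc_auc(y_true, y_score):
    """ROC-AUC via the Mann-Whitney U statistic (tied scores get average ranks)."""
    y_true = np.asarray(y_true, dtype=int)
    y_score = np.asarray(y_score, dtype=float)
    n = len(y_score)
    order = np.argsort(y_score, kind="mergesort")
    sorted_scores = y_score[order]
    ranks = np.empty(n, dtype=float)
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        # 1-based average rank for every element in the tie group.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n_pos = int(y_true.sum())
    n_neg = n - n_pos
    u_stat = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


def binned_ece(y_true, y_prob, n_bins):
    """Equal-mass binned calibration error plus the per-bin label means."""
    order = np.argsort(y_prob, kind="mergesort")
    label_bins = np.array_split(y_true[order], n_bins)
    prob_bins = np.array_split(y_prob[order], n_bins)
    label_means = np.array([b.mean() for b in label_bins])
    prob_means = np.array([b.mean() for b in prob_bins])
    weights = np.array([len(b) for b in label_bins]) / len(y_true)
    ece = float(np.sum(weights * np.abs(label_means - prob_means)))
    return ece, label_means


def ece_sweep(y_true, y_prob):
    """Assumed sweep rule: grow the bin count until the label means stop being monotonic."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    best, _ = binned_ece(y_true, y_prob, 1)
    for n_bins in range(2, len(y_true) + 1):
        ece, label_means = binned_ece(y_true, y_prob, n_bins)
        if np.any(np.diff(label_means) < 0):
            break  # monotonicity violated: keep the previous bin count
        best = ece
    return best


# Toy labels and probabilities, invented purely for illustration.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print("ROC-AUC:", roc_auc(labels, probs))
print("ECE sweep:", ece_sweep(labels, probs))
```

In a robustness comparison of the kind the summary describes, one would evaluate both functions on clean and on noise-corrupted test predictions for each architecture and compare the differences; the toy inputs above are not data from the study.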