Technical and imaging factors influencing performance of deep learning systems for diabetic retinopathy

Abstract: Deep learning (DL) has been shown to be effective in developing diabetic retinopathy (DR) algorithms, possibly tackling financial and manpower challenges hindering implementation of DR screening. However, our systematic review of the literature reveals that few studies have examined the impact of diff...

Bibliographic Details
Main Authors: Michelle Y. T. Yip, Gilbert Lim, Zhan Wei Lim, Quang D. Nguyen, Crystal C. Y. Chong, Marco Yu, Valentina Bellemo, Yuchen Xie, Xin Qi Lee, Haslina Hamzah, Jinyi Ho, Tien-En Tan, Charumathi Sabanayagam, Andrzej Grzybowski, Gavin S. W. Tan, Wynne Hsu, Mong Li Lee, Tien Yin Wong, Daniel S. W. Ting
Format: Article
Language: English
Published: Nature Portfolio 2020-03-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-020-0247-1
collection DOAJ
description Abstract: Deep learning (DL) has been shown to be effective in developing diabetic retinopathy (DR) algorithms, possibly tackling financial and manpower challenges hindering implementation of DR screening. However, our systematic review of the literature reveals that few studies have examined how different factors, which are important for clinical deployment in real-world settings, affect these DL algorithms. Using 455,491 retinal images, we evaluated two technical and three image-related factors in the detection of referable DR. For technical factors, we evaluated the performance of four DL models (VGGNet, ResNet, DenseNet, Ensemble) and two computational frameworks (Caffe, TensorFlow). For image-related factors, we evaluated image compression levels (reducing image size to 350, 300, 250, 200 and 150 KB), number of fields (7-field, 2-field, 1-field) and media clarity (pseudophakic vs phakic). In the detection of referable DR, the four DL models showed comparable diagnostic performance (AUC 0.936-0.944). For the VGGNet model, the two computational frameworks yielded similar AUC (0.936). DL performance dropped when image size decreased below 250 KB (AUC 0.936, 0.900, p < 0.001). DL performance improved as the number of fields increased (dataset 1: 2-field vs 1-field, AUC 0.936 vs 0.908, p < 0.001; dataset 2: 7-field vs 2-field vs 1-field, AUC 0.949 vs 0.911 vs 0.895). DL performed better in pseudophakic than in phakic eyes (AUC 0.918 vs 0.833, p < 0.001). Image-related factors play a more significant role than technical factors in determining diagnostic performance, suggesting the importance of robust training and testing datasets for DL training and deployment in real-world settings.
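The abstract reports every comparison as an area under the ROC curve (AUC). As a reading aid, the sketch below shows one common way to compute AUC from binary labels and model scores, using the pairwise rank identity: AUC is the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. This is a minimal illustration and not the authors' evaluation code; the function name `auc` and the toy labels and scores are assumptions made up for the example and are not data from the study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the pairwise (Mann-Whitney) identity; labels are 1 or 0."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = referable DR, 0 = non-referable.
labels = [1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]  # e.g. scores under one condition
scores_b = [0.9, 0.6, 0.3, 0.7, 0.5, 0.2, 0.1]  # e.g. scores under another condition

auc_a = auc(labels, scores_a)
auc_b = auc(labels, scores_b)
print(f"condition A: {auc_a:.3f}, condition B: {auc_b:.3f}")
```

The quadratic pair loop is chosen for clarity; for datasets of realistic size, a rank-based implementation or an established library routine would be the more practical choice.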
id doaj.art-465788d7e8eb44a1b4e7db09b3dfd067
institution Directory Open Access Journal
issn 2398-6352
citation npj Digital Medicine, volume 3, issue 1, pages 1-12 (2020-03-01), https://doi.org/10.1038/s41746-020-0247-1
affiliations Singapore Eye Research Institute, Singapore National Eye Center: Michelle Y. T. Yip, Gilbert Lim, Quang D. Nguyen, Crystal C. Y. Chong, Marco Yu, Valentina Bellemo, Yuchen Xie, Xin Qi Lee, Haslina Hamzah, Jinyi Ho, Tien-En Tan, Charumathi Sabanayagam, Gavin S. W. Tan, Tien Yin Wong, Daniel S. W. Ting
School of Computing, National University of Singapore: Zhan Wei Lim, Wynne Hsu, Mong Li Lee
Department of Ophthalmology, University of Warmia and Mazury: Andrzej Grzybowski