A generalizable deep learning regression model for automated glaucoma screening from fundus images

Bibliographic Details
Main Authors: Ruben Hemelings, Bart Elen, Alexander K. Schuster, Matthew B. Blaschko, João Barbosa-Breda, Pekko Hujanen, Annika Junglas, Stefan Nickels, Andrew White, Norbert Pfeiffer, Paul Mitchell, Patrick De Boever, Anja Tuulonen, Ingeborg Stalmans
Format: Article
Language: English
Published: Nature Portfolio 2023-06-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-023-00857-0
Collection: DOAJ
Abstract: A plethora of classification models for the detection of glaucoma from fundus images have been proposed in recent years. Often trained with data from a single glaucoma clinic, they report impressive performance on internal test sets, but tend to struggle in generalizing to external sets. This performance drop can be attributed to data shifts in glaucoma prevalence, fundus camera, and the definition of glaucoma ground truth. In this study, we confirm that a previously described regression network for glaucoma referral (G-RISK) obtains excellent results in a variety of challenging settings. Thirteen different data sources of labeled fundus images were utilized. The data sources include two large population cohorts (the Australian Blue Mountains Eye Study, BMES, and the German Gutenberg Health Study, GHS) and 11 publicly available datasets (AIROGS, ORIGA, REFUGE1, LAG, ODIR, REFUGE2, GAMMA, RIM-ONEr3, RIM-ONE DL, ACRIMA, PAPILA). To minimize data shifts in input data, a standardized image processing strategy was developed to obtain 30° disc-centered images from the original data. A total of 149,455 images were included for model testing. The area under the receiver operating characteristic curve (AUC) for the BMES and GHS population cohorts was 0.976 [95% CI: 0.967–0.986] and 0.984 [95% CI: 0.980–0.991] at the participant level, respectively. At a fixed specificity of 95%, sensitivities were 87.3% and 90.3%, respectively, surpassing the minimum criterion of 85% sensitivity recommended by Prevent Blindness America. AUC values on the 11 publicly available datasets ranged from 0.854 to 0.988. These results confirm the excellent generalizability of a glaucoma risk regression model trained with homogeneous data from a single tertiary referral center. Further validation using prospective cohort studies is warranted.
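The two headline metrics in the abstract, AUC and sensitivity at a fixed specificity of 95%, can both be computed directly from continuous risk scores. The following is a minimal sketch of those two calculations, not the authors' evaluation code: the function names `auc` and `sensitivity_at_specificity` and the toy scores and labels are hypothetical, and the sketch assumes that a higher score means higher glaucoma risk and that a case is flagged when its score is at or above the threshold.

```python
from typing import List, Sequence, Tuple


def _split(scores: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Separate scores into positives (label 1) and negatives (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    return pos, neg


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos, neg = _split(scores, labels)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(
    scores: Sequence[float], labels: Sequence[int], target_specificity: float = 0.95
) -> Tuple[float, float]:
    """Return (sensitivity, threshold) for the lowest threshold whose
    specificity reaches the target. Thresholds are scanned in ascending
    order, so the first one that qualifies gives the highest sensitivity."""
    pos, neg = _split(scores, labels)
    candidates = sorted(set(scores)) + [float("inf")]
    for t in candidates:
        specificity = sum(n < t for n in neg) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(p >= t for p in pos) / len(pos)
            return sensitivity, t
    raise RuntimeError("unreachable: an infinite threshold gives specificity 1.0")


# Hypothetical toy data: five healthy eyes (0) and three glaucoma eyes (1).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.35, 0.7, 0.9]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print("AUC:", auc(toy_scores, toy_labels))
    print("Sensitivity, threshold:", sensitivity_at_specificity(toy_scores, toy_labels))
```

Fixing specificity first and then reading off sensitivity mirrors how a screening criterion such as the 85% sensitivity minimum is checked. The pairwise AUC loop is quadratic in the number of cases, so a rank-based implementation would be preferable for cohorts the size of those described here.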
Record ID: doaj.art-186fb58fa7c147dca61e4bdcee0903bb
Institution: Directory of Open Access Journals
ISSN: 2398-6352
Author affiliations:
Ruben Hemelings: Research Group Ophthalmology, Department of Neurosciences, KU Leuven
Bart Elen: Flemish Institute for Technological Research (VITO)
Alexander K. Schuster: Department of Ophthalmology, University Medical Center Mainz
Matthew B. Blaschko: ESAT-PSI, KU Leuven
João Barbosa-Breda: Research Group Ophthalmology, Department of Neurosciences, KU Leuven
Pekko Hujanen: Tays Eye Centre, Tampere University Hospital
Annika Junglas: Department of Ophthalmology, University Medical Center Mainz
Stefan Nickels: Department of Ophthalmology, University Medical Center Mainz
Andrew White: Department of Ophthalmology, The University of Sydney
Norbert Pfeiffer: Department of Ophthalmology, University Medical Center Mainz
Paul Mitchell: Department of Ophthalmology, The University of Sydney
Patrick De Boever: Centre for Environmental Sciences, Hasselt University, Agoralaan building D
Anja Tuulonen: Tays Eye Centre, Tampere University Hospital
Ingeborg Stalmans: Research Group Ophthalmology, Department of Neurosciences, KU Leuven