Accuracy and self-validation of automated bone age determination

Abstract: The BoneXpert method for automated determination of bone age from hand X-rays was introduced in 2009 and is currently running in over 200 hospitals. The aim of this work is to present version 3 of the method and validate its accuracy and self-validation mechanism that automatically rejects...


Bibliographic Details
Main Authors: D. D. Martin, A. D. Calder, M. B. Ranke, G. Binder, H. H. Thodberg
Format: Article
Language: English
Published: Nature Portfolio 2022-04-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-022-10292-y
collection DOAJ
description Abstract: The BoneXpert method for automated determination of bone age from hand X-rays was introduced in 2009 and is currently running in over 200 hospitals. The aim of this work is to present version 3 of the method and to validate both its accuracy and its self-validation mechanism, which automatically rejects an image if it is at risk of being analysed incorrectly. The training set included 14,036 images from the 2017 Radiological Society of North America (RSNA) Bone Age Challenge, 1,642 images of normal Dutch and Californian children, and 8,250 images from Tübingen from patients with Short Stature, Congenital Adrenal Hyperplasia and Precocious Puberty. The study resulted in a cross-validated root mean square (RMS) error in the Tübingen images of 0.62 y, compared to 0.72 y in the previous version. The RMS error on the RSNA test set of 200 images was 0.45 y relative to the average of six manual ratings. The self-validation mechanism rejected 0.4% of the RSNA images. Among the self-validated images of the Tübingen study, 121 outliers were re-rated, resulting in 6 cases where BoneXpert deviated more than 1.5 years from the average of the three re-ratings, compared to 72 such cases for the original manual ratings. The accuracy of BoneXpert is clearly better than the accuracy of a single manual rating. The self-validation mechanism rejected very few images, typically with abnormal anatomy, and among the accepted images, there were 12 times fewer severe bone age errors than in manual ratings, suggesting that BoneXpert could be safer than manual rating.
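The two figures of merit quoted in the abstract, the RMS error against the average of several manual ratings and the count of deviations above 1.5 years, can be illustrated with a minimal Python sketch. The function names `rms_error` and `count_severe` and all rating values below are hypothetical and made up for illustration; they are not taken from the paper or from BoneXpert. Only the final ratio uses numbers from the abstract (72 and 6 severe deviations).

```python
import math
from statistics import mean


def rms_error(estimates, reference):
    """Root mean square of (estimate - reference), in years."""
    if len(estimates) != len(reference) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(mean((e - r) ** 2 for e, r in zip(estimates, reference)))


def count_severe(estimates, reference, threshold=1.5):
    """Number of cases deviating from the reference by more than `threshold` years."""
    return sum(1 for e, r in zip(estimates, reference) if abs(e - r) > threshold)


# Hypothetical data: three manual ratings per image (years), invented for this sketch.
manual_ratings = [[10.0, 10.5, 11.0], [7.0, 7.5, 8.0], [13.0, 13.0, 13.0]]
# The reference for each image is the average of its manual ratings.
reference = [mean(r) for r in manual_ratings]
# Hypothetical automated estimates for the same three images.
automated = [10.5, 8.0, 15.0]

print(f"RMS error: {rms_error(automated, reference):.2f} y")
print(f"Severe deviations (> 1.5 y): {count_severe(automated, reference)}")

# Ratio behind the abstract's "12 times fewer": 72 manual vs 6 automated cases.
print(f"Ratio of severe cases: {72 / 6:.0f}")
```

Averaging several ratings before comparing reduces the noise of any single rater in the reference, which is why the abstract reports errors relative to the average of six manual ratings (RSNA) or three re-ratings (Tübingen).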
first_indexed 2024-12-12T22:08:24Z
format Article
id doaj.art-0d1299e747ad494cbd8d9c9adc2d5800
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-12-12T22:08:24Z
publishDate 2022-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-0d1299e747ad494cbd8d9c9adc2d5800; 2022-12-22T00:10:18Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2022-04-01; vol. 12, no. 1, pp. 1-12; 10.1038/s41598-022-10292-y; Accuracy and self-validation of automated bone age determination; D. D. Martin (University of Witten/Herdecke); A. D. Calder (Great Ormond Street Hospital for Children NHS Foundation Trust); M. B. Ranke (Pediatric Endocrinology, University Children’s Hospital); G. Binder (Pediatric Endocrinology, University Children’s Hospital); H. H. Thodberg (Visiana); https://doi.org/10.1038/s41598-022-10292-y
title Accuracy and self-validation of automated bone age determination
url https://doi.org/10.1038/s41598-022-10292-y