An increasing number of convolutional neural networks for fracture recognition and classification in orthopaedics: are these externally validated and ready for clinical application?

Aims: The number of convolutional neural networks (CNN) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial to assess generalizabilit...

Bibliographic Details
Main Authors:	Luisa Oliveira e Carmo, Anke van den Merkhof, Jakub Olczak, Max Gordon, Paul C. Jutte, Ruurd L. Jaarsma, Frank F. A. IJpma, Job N. Doornberg, Jasper Prijs, Machine Learning Consortium
Format:	Article
Language:	English
Published:	The British Editorial Society of Bone & Joint Surgery 2021-10-01
Series:	Bone & Joint Open
Subjects:	artificial intelligence; external validation; convolutional neural networks; machine learning; deep learning; orthopaedic trauma; prognosis; radiographs; orthopaedic surgeons; elbows; CT scans; hip; distal radius fractures; variances; cadaveric studies
Online Access:	https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.210.BJO-2021-0133

author	Luisa Oliveira e Carmo, Anke van den Merkhof, Jakub Olczak, Max Gordon, Paul C. Jutte, Ruurd L. Jaarsma, Frank F. A. IJpma, Job N. Doornberg, Jasper Prijs, Machine Learning Consortium
author_sort	Luisa Oliveira e Carmo
collection	DOAJ
description	Aims: The number of convolutional neural networks (CNNs) available for fracture detection and classification is rapidly increasing. External validation of a CNN on a temporally separate (separated by time) or geographically separate (separated by location) dataset is crucial for assessing the generalizability of the CNN before it is applied in clinical practice at other institutions. We aimed to answer three questions: are current CNNs for fracture recognition externally valid; which methods are applied for external validation (EV); and how does the reported performance on the EV sets compare with that on the internal validation (IV) sets of these CNNs? Methods: The PubMed and Embase databases were systematically searched from January 2010 to October 2020 according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The type of EV, characteristics of the external dataset, and diagnostic performance characteristics on the IV and EV datasets were collected and compared. Quality assessment was conducted using a seven-item checklist based on a modified Methodologic Index for NOn-Randomized Studies instrument (MINORS). Results: Out of 1,349 studies, 36 reported development of a CNN for fracture detection and/or classification. Of these, only four (11%) reported a form of EV. One study used temporal EV, one conducted both temporal and geographical EV, and two used geographical EV. Performance on the IV set versus the EV set was reported as follows: AUCs of 0.967 (IV) versus 0.975 (EV), 0.976 (IV) versus 0.985 to 0.992 (EV), and 0.93 to 0.96 (IV) versus 0.80 to 0.89 (EV), and F1-scores of 0.856 to 0.863 (IV) versus 0.757 to 0.840 (EV). Conclusion: Externally validated CNNs for fracture recognition in orthopaedic trauma are still scarce. This greatly limits the potential to transfer these CNNs from the developing institute to another hospital and achieve similar diagnostic performance. We recommend geographical EV and the use of statements such as the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI), and the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis–Machine Learning (TRIPOD-ML) to critically appraise the performance of CNNs, improve the methodological rigor and quality of future models, and facilitate eventual implementation in clinical practice.
first_indexed	2024-12-19T00:02:41Z
format	Article
id	doaj.art-2a4623a5c87943efad356faa435f1d00
institution	Directory of Open Access Journals
issn	2633-1462
language	English
last_indexed	2024-12-19T00:02:41Z
publishDate	2021-10-01
publisher	The British Editorial Society of Bone & Joint Surgery
record_format	Article
series	Bone & Joint Open
spelling	doaj.art-2a4623a5c87943efad356faa435f1d00 | 2022-12-21T20:46:23Z | eng | The British Editorial Society of Bone & Joint Surgery | Bone & Joint Open | 2633-1462 | 2021-10-01 | vol. 2, no. 10, pp. 879-885 | 10.1302/2633-1462.210.BJO-2021-0133 | Authors: Luisa Oliveira e Carmo [0], Anke van den Merkhof [1], Jakub Olczak [2], Max Gordon [3], Paul C. Jutte [4], Ruurd L. Jaarsma [5], Frank F. A. IJpma [6], Job N. Doornberg [7], Jasper Prijs [8], Machine Learning Consortium | Affiliations: [0] Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands; [1] Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; [2] Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Stockholm, Sweden; [3] Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Stockholm, Sweden; [4] Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands; [5] Department of Orthopaedic Surgery, Flinders Medical Centre, Bedford Park, Adelaide, South Australia, Australia; [6] Department of Trauma Surgery, University Medical Centre Groningen, University of Groningen, Groningen, Groningen, Netherlands; [7] Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands; [8] Department of Orthopaedic Surgery, University Medical Centre, University of Groningen, Groningen, Groningen, Netherlands
title	An increasing number of convolutional neural networks for fracture recognition and classification in orthopaedics: are these externally validated and ready for clinical application?

Similar Items