Image annotation and curation in radiology: an overview for machine learning practitioners

Abstract “Garbage in, garbage out” summarises well the importance of high-quality data in machine learning and artificial intelligence. All data used to train and validate models should indeed be consistent, standardised, traceable, correctly annotated, and de-identified, considering local regulatio...

Bibliographic Details
Main Authors: Fabio Galbusera, Andrea Cina
Format: Article
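Language: English
Published: SpringerOpen 2024-02-01
Series: European Radiology Experimental
Subjects: Artificial intelligence; Data curation; Image processing (computer-assisted); Machine learning; Privacy
Online Access: https://doi.org/10.1186/s41747-023-00408-y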
author Fabio Galbusera
Andrea Cina
collection DOAJ
description Abstract “Garbage in, garbage out” summarises well the importance of high-quality data in machine learning and artificial intelligence. All data used to train and validate models should indeed be consistent, standardised, traceable, correctly annotated, and de-identified, considering local regulations. This narrative review presents a summary of the techniques that are used to ensure that all these requirements are fulfilled, with special emphasis on radiological imaging and freely available software solutions that can be directly employed by the interested researcher. Topics discussed include key imaging concepts, such as image resolution and pixel depth; file formats for medical image data storage; free software solutions for medical image processing; anonymisation and pseudonymisation to protect patient privacy, including compliance with regulations such as the Regulation (EU) 2016/679 “General Data Protection Regulation” (GDPR) and the 1996 United States Act of Congress “Health Insurance Portability and Accountability Act” (HIPAA); methods to eliminate patient-identifying features within images, like facial structures; free and commercial tools for image annotation; and techniques for data harmonisation and normalisation.
Relevance statement This review provides an overview of the methods and tools that can be used to ensure high-quality data for machine learning and artificial intelligence applications in radiology.
Key points
• High-quality datasets are essential for reliable artificial intelligence algorithms in medical imaging.
• Software tools like ImageJ and 3D Slicer aid in processing medical images for AI research.
• Anonymisation techniques protect patient privacy during dataset preparation.
• Machine learning models can accelerate image annotation, enhancing efficiency and accuracy.
• Data curation ensures dataset integrity, compliance, and quality for artificial intelligence development.
first_indexed 2024-03-07T15:20:45Z
format Article
id doaj.art-36d6f3ed59d3482f853d3b166b2812c2
institution Directory of Open Access Journals
issn 2509-9280
language English
last_indexed 2024-03-07T15:20:45Z
publishDate 2024-02-01
publisher SpringerOpen
record_format Article
series European Radiology Experimental
spelling doaj.art-36d6f3ed59d3482f853d3b166b2812c2; 2024-03-05T17:38:14Z; eng; SpringerOpen; European Radiology Experimental; 2509-9280; 2024-02-01; 81112; 10.1186/s41747-023-00408-y; Image annotation and curation in radiology: an overview for machine learning practitioners; Fabio Galbusera (Spine Center, Schulthess Clinic); Andrea Cina (Spine Center, Schulthess Clinic); https://doi.org/10.1186/s41747-023-00408-y; Artificial intelligence; Data curation; Image processing (computer-assisted); Machine learning; Privacy
title Image annotation and curation in radiology: an overview for machine learning practitioners
topic Artificial intelligence
Data curation
Image processing (computer-assisted)
Machine learning
Privacy
url https://doi.org/10.1186/s41747-023-00408-y