Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images
Background: The development of artificial intelligence (AI)-based algorithms and advances in medical domains rely on large datasets. A recent advancement in text-to-image generative AI is GLIDE (Guided Language to Image Diffusion for Generation and Editing). There are a number of representations avail...
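Main Authors: | Veronika Shavlokhova, Andreas Vollmer, Christos C. Zouboulis, Michael Vollmer, Jakob Wollborn, Gernot Lang, Alexander Kübler, Stefan Hartmann, Christian Stoll, Elisabeth Roider, Babak Saravi |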
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2023-10-01 |
Series: | Frontiers in Medicine |
Subjects: | GLIDE, text-to-image, stable diffusion, dermoscopy, cancer, dermatology |
Online Access: | https://www.frontiersin.org/articles/10.3389/fmed.2023.1231436/full |
_version_ | 1797654415922954240 |
---|---|
author | Veronika Shavlokhova; Andreas Vollmer; Christos C. Zouboulis; Michael Vollmer; Jakob Wollborn; Gernot Lang; Alexander Kübler; Stefan Hartmann; Christian Stoll; Elisabeth Roider; Babak Saravi |
author_facet | Veronika Shavlokhova; Andreas Vollmer; Christos C. Zouboulis; Michael Vollmer; Jakob Wollborn; Gernot Lang; Alexander Kübler; Stefan Hartmann; Christian Stoll; Elisabeth Roider; Babak Saravi |
author_sort | Veronika Shavlokhova |
collection | DOAJ |
description | Background: The development of artificial intelligence (AI)-based algorithms and advances in medical domains rely on large datasets. A recent advancement in text-to-image generative AI is GLIDE (Guided Language to Image Diffusion for Generation and Editing). There are a number of representations available in the GLIDE model, but it has not been refined for medical applications. Methods: For text-conditional image synthesis with classifier-free guidance, we have fine-tuned GLIDE using 10,015 dermoscopic images of seven diagnostic entities, including melanoma and melanocytic nevi. Photorealistic synthetic samples of each diagnostic entity were created by the algorithm. Following this, an experienced dermatologist reviewed 140 images (20 of each entity), with 10 samples originating from artificial intelligence and 10 from original images from the dataset. The dermatologist classified the provided images according to the seven diagnostic entities. Additionally, the dermatologist was asked to indicate whether or not a particular image was created by AI. Further, we trained a deep learning model to compare the diagnostic results of dermatologist versus machine for entity classification. Results: The results indicate that the generated images possess varying degrees of quality and realism, with melanocytic nevi and melanoma having higher similarity to real images than other classes. The integration of synthetic images improved the classification performance of the model, resulting in higher accuracy and precision. The AI assessment showed superior classification performance compared to dermatologist. Conclusion: Overall, the results highlight the potential of synthetic images for training and improving AI models in dermatology to overcome data scarcity. |
first_indexed | 2024-03-11T16:59:12Z |
format | Article |
id | doaj.art-525a35f588a744f6886b06ffd5ff679c |
institution | Directory Open Access Journal |
issn | 2296-858X |
language | English |
last_indexed | 2024-03-11T16:59:12Z |
publishDate | 2023-10-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Medicine |
spelling | doaj.art-525a35f588a744f6886b06ffd5ff679c; 2023-10-20T12:44:45Z; eng; Frontiers Media S.A.; Frontiers in Medicine; 2296-858X; 2023-10-01; 10; 10.3389/fmed.2023.1231436; 1231436; Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images; Veronika Shavlokhova (0), Andreas Vollmer (1), Christos C. Zouboulis (2), Michael Vollmer (3), Jakob Wollborn (4), Gernot Lang (5), Alexander Kübler (6), Stefan Hartmann (7), Christian Stoll (8), Elisabeth Roider (9), Babak Saravi (10), Babak Saravi (11); (0) Maxillofacial Surgery University Hospital Ruppin-Fehrbelliner Straße Neuruppin, Neuruppin, Germany; (1) Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, Würzburg, Germany; (2) Departments of Dermatology, Venereology, Allergology and Immunology, Staedtisches Klinikum Dessau, Medical School Theodor Fontane and Faculty of Health Sciences Brandenburg, Dessau, Germany; (3) Department of Oral and Maxillofacial Surgery, Tuebingen University Hospital, Tuebingen, Germany; (4) Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States; (5) Department of Orthopedics and Trauma Surgery, Medical Centre-Albert-Ludwigs-University of Freiburg, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany; (6) Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, Würzburg, Germany; (7) Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, Würzburg, Germany; (8) Maxillofacial Surgery University Hospital Ruppin-Fehrbelliner Straße Neuruppin, Neuruppin, Germany; (9) Department of Dermatology, University Hospital of Basel, Basel, Switzerland; (10) Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States; (11) Department of Orthopedics and Trauma Surgery, Medical Centre-Albert-Ludwigs-University of Freiburg, Faculty of Medicine, Albert-Ludwigs-University of Freiburg, Freiburg, Germany; Background: The development of artificial intelligence (AI)-based algorithms and advances in medical domains rely on large datasets. A recent advancement in text-to-image generative AI is GLIDE (Guided Language to Image Diffusion for Generation and Editing). There are a number of representations available in the GLIDE model, but it has not been refined for medical applications. Methods: For text-conditional image synthesis with classifier-free guidance, we have fine-tuned GLIDE using 10,015 dermoscopic images of seven diagnostic entities, including melanoma and melanocytic nevi. Photorealistic synthetic samples of each diagnostic entity were created by the algorithm. Following this, an experienced dermatologist reviewed 140 images (20 of each entity), with 10 samples originating from artificial intelligence and 10 from original images from the dataset. The dermatologist classified the provided images according to the seven diagnostic entities. Additionally, the dermatologist was asked to indicate whether or not a particular image was created by AI. Further, we trained a deep learning model to compare the diagnostic results of dermatologist versus machine for entity classification. Results: The results indicate that the generated images possess varying degrees of quality and realism, with melanocytic nevi and melanoma having higher similarity to real images than other classes. The integration of synthetic images improved the classification performance of the model, resulting in higher accuracy and precision. The AI assessment showed superior classification performance compared to dermatologist. Conclusion: Overall, the results highlight the potential of synthetic images for training and improving AI models in dermatology to overcome data scarcity.; https://www.frontiersin.org/articles/10.3389/fmed.2023.1231436/full; GLIDE; text-to-image; stable diffusion; dermoscopy; cancer; dermatology |
spellingShingle | Veronika Shavlokhova Andreas Vollmer Christos C. Zouboulis Michael Vollmer Jakob Wollborn Gernot Lang Alexander Kübler Stefan Hartmann Christian Stoll Elisabeth Roider Babak Saravi Babak Saravi Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images Frontiers in Medicine GLIDE text-to-image stable diffusion dermoscopy cancer dermatology |
title | Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images |
title_full | Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images |
title_fullStr | Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images |
title_full_unstemmed | Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images |
title_short | Finetuning of GLIDE stable diffusion model for AI-based text-conditional image synthesis of dermoscopic images |
title_sort | finetuning of glide stable diffusion model for ai based text conditional image synthesis of dermoscopic images |
topic | GLIDE; text-to-image; stable diffusion; dermoscopy; cancer; dermatology |
url | https://www.frontiersin.org/articles/10.3389/fmed.2023.1231436/full |
work_keys_str_mv | AT veronikashavlokhova finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT andreasvollmer finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT christosczouboulis finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT michaelvollmer finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT jakobwollborn finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT gernotlang finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT alexanderkubler finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT stefanhartmann finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT christianstoll finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT elisabethroider finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT babaksaravi finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages AT babaksaravi finetuningofglidestablediffusionmodelforaibasedtextconditionalimagesynthesisofdermoscopicimages |