Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network
Image captioning is a process of automatically generating descriptive sentences for a given image. Text-to-image search is a form of search in which images are retrieved based on matching keywords and image features. We focus on the case in which multiple description sentences are generated for one...
Main Authors: | Yihlon Lin, Kuiyou Lai, Wenyu Chang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
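Subjects: | Fully convolutional network; image caption; discriminator; autoencoder; multi-label classification; Siamese network |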
Online Access: | https://ieeexplore.ieee.org/document/10054038/ |
_version_ | 1797870678101196800 |
---|---|
author | Yihlon Lin, Kuiyou Lai, Wenyu Chang |
author_facet | Yihlon Lin, Kuiyou Lai, Wenyu Chang |
author_sort | Yihlon Lin |
collection | DOAJ |
description | Image captioning is a process of automatically generating descriptive sentences for a given image. Text-to-image search is a form of search in which images are retrieved based on matching keywords and image features. We focus on the case in which multiple description sentences are generated for one image. In this study, we used four learning models: 1) a discriminator, which is a binary classifier that distinguishes skin from background using image segmentation; 2) an autoencoder; 3) a multiclass classification model combining the features from the discriminator and autoencoder and producing keyword labels; and 4) a Siamese network learning the textual similarity matching between colloquial description sentences of skin imaging pathology and keywords produced from the multi-class classifier. The experimental results show that the proposed method yields an accuracy of up to 99% for the testing data in terms of colloquial language of skin images. This study enabled users to read the skin. For teaching research on skin diagnosis, the proposed method can significantly relieve the shortage of training personnel and assist hospitals that lack resources for conducting case studies. The results of this study are expected to be feasible and can be applied in actual clinical teaching. For medical education in dermatology, the findings of this study contribute to the practical value of quantitative indicators and assessments for learning outcomes of medical students. |
first_indexed | 2024-04-10T00:32:18Z |
format | Article |
id | doaj.art-a193cc24f5de43a1bfd76edeb34b186d |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-10T00:32:18Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a193cc24f5de43a1bfd76edeb34b186d; 2023-03-14T23:00:28Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 23447; 23454; 10.1109/ACCESS.2023.3249462; 10054038; Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network; Yihlon Lin (0, https://orcid.org/0000-0002-8215-2977); Kuiyou Lai (1); Wenyu Chang (2, https://orcid.org/0000-0003-0393-1030); (0) Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Douliu, Yunlin, Taiwan; (1) Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Douliu, Yunlin, Taiwan; (2) Department of Dermatology, E-Da Cancer Hospital, I-Shou University, Kaohsiung, Taiwan; Image captioning is a process of automatically generating descriptive sentences for a given image. Text-to-image search is a form of search in which images are retrieved based on matching keywords and image features. We focus on the case in which multiple description sentences are generated for one image. In this study, we used four learning models: 1) a discriminator, which is a binary classifier that distinguishes skin from background using image segmentation; 2) an autoencoder; 3) a multiclass classification model combining the features from the discriminator and autoencoder and producing keyword labels; and 4) a Siamese network learning the textual similarity matching between colloquial description sentences of skin imaging pathology and keywords produced from the multi-class classifier. The experimental results show that the proposed method yields an accuracy of up to 99% for the testing data in terms of colloquial language of skin images. This study enabled users to read the skin. For teaching research on skin diagnosis, the proposed method can significantly relieve the shortage of training personnel and assist hospitals that lack resources for conducting case studies. The results of this study are expected to be feasible and can be applied in actual clinical teaching. For medical education in dermatology, the findings of this study contribute to the practical value of quantitative indicators and assessments for learning outcomes of medical students.; https://ieeexplore.ieee.org/document/10054038/; Fully convolutional network; image caption; discriminator; autoencoder; multi-label classification; Siamese network |
spellingShingle | Yihlon Lin; Kuiyou Lai; Wenyu Chang; Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network; IEEE Access; Fully convolutional network; image caption; discriminator; autoencoder; multi-label classification; Siamese network |
title | Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network |
title_full | Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network |
title_fullStr | Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network |
title_full_unstemmed | Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network |
title_short | Skin Medical Image Captioning Using Multi-Label Classification and Siamese Network |
title_sort | skin medical image captioning using multi label classification and siamese network |
topic | Fully convolutional network; image caption; discriminator; autoencoder; multi-label classification; Siamese network |
url | https://ieeexplore.ieee.org/document/10054038/ |
work_keys_str_mv | AT yihlonlin skinmedicalimagecaptioningusingmultilabelclassificationandsiamesenetwork AT kuiyoulai skinmedicalimagecaptioningusingmultilabelclassificationandsiamesenetwork AT wenyuchang skinmedicalimagecaptioningusingmultilabelclassificationandsiamesenetwork |
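
The description field above says a Siamese network matches keyword labels from the classifier against colloquial description sentences. The following is only a minimal sketch of that matching idea, not the authors' model: the shared encoder is a hypothetical hashed bag-of-words stand-in for a learned text encoder, and `DIM`, `encode`, `cosine`, and `best_caption` are names invented for this illustration. What makes it Siamese in spirit is that both inputs pass through the same encoder before being compared.

```python
import hashlib
import math

DIM = 4096  # embedding size; an arbitrary choice for this sketch


def encode(text):
    """Shared encoder used by both branches of the pair.

    Assumption: a hashed bag-of-words vector stands in for the
    learned text encoder a real Siamese network would train.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        # md5 gives a hash that is stable across Python processes
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_caption(keywords, candidates):
    """Return the candidate sentence most similar to the keyword labels."""
    query = encode(" ".join(keywords))
    return max(candidates, key=lambda s: cosine(query, encode(s)))
```

For example, `best_caption(["red", "scaly", "patch"], sentences)` would rank a list of hypothetical description sentences by their overlap with the keywords. A trained network would replace `encode` with learned weights shared by both branches.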