Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model

Intracerebral hemorrhage (ICH) is a severe cerebrovascular disorder that poses a life-threatening risk, necessitating swift diagnosis and treatment. While CT scans are the most effective diagnostic tool for detecting cerebral hemorrhage, their interpretation typically requires the expertise of skill...


Bibliographic Details
Main Authors: Jin-Woo Kong, Byoung-Doo Oh, Chulho Kim, Yu-Seop Kim
Format: Article
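Language: English
Published: MDPI AG 2024-01-01
Series: Applied Sciences
Subjects: intracerebral hemorrhage; medical image captioning; deep learning; convolutional neural network; GPT-2
Online Access: https://www.mdpi.com/2076-3417/14/3/1193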
author Jin-Woo Kong
Byoung-Doo Oh
Chulho Kim
Yu-Seop Kim
author_sort Jin-Woo Kong
collection DOAJ
description Intracerebral hemorrhage (ICH) is a severe, life-threatening cerebrovascular disorder that requires swift diagnosis and treatment. CT scans are the most effective diagnostic tool for detecting cerebral hemorrhage, but their interpretation typically requires skilled professionals, so diagnosis may be delayed in regions where such experts are scarce or in situations with time constraints. In this paper, we propose a method that combines a pre-trained CNN classifier and GPT-2 to generate text for sequentially acquired ICH CT images. The CNN is first fine-tuned to detect the presence of ICH in publicly available single CT images; it then extracts feature vectors (i.e., a matrix) from 3D ICH CT images. These vectors are input along with text into GPT-2, which is trained to generate text for consecutive CT images. In experiments, we evaluated four models to determine the most suitable image captioning model: (1) in the N-gram-based evaluation, ResNet50V2 and DenseNet121 showed relatively high scores; (2) in the embedding-based evaluation, DenseNet121 exhibited the best performance; (3) overall, the models performed well on BERTScore. Our proposed method offers an automatic approach to analyzing 3D ICH CT images, contributing to the efficiency of ICH diagnosis and treatment.
first_indexed 2024-03-08T04:00:44Z
format Article
id doaj.art-9b3237700e6a44d3a8a333f6ca097545
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-08T04:00:44Z
publishDate 2024-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-9b3237700e6a44d3a8a333f6ca097545; 2024-02-09T15:08:11Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2024-01-01; vol. 14, no. 3, art. 1193; DOI 10.3390/app14031193
Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model
Jin-Woo Kong (0), Byoung-Doo Oh (1), Chulho Kim (2), Yu-Seop Kim (3)
(0) Department of Convergence Software, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea
(1) Cerebrovascular Disease Research Center, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea
(2) Department of Neurology, Chuncheon Sacred Heart Hospital, Chuncheon-si 24253, Gangwon-do, Republic of Korea
(3) Department of Convergence Software, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea
https://www.mdpi.com/2076-3417/14/3/1193
intracerebral hemorrhage; medical image captioning; deep learning; convolutional neural network; GPT-2
title Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model
topic intracerebral hemorrhage
medical image captioning
deep learning
convolutional neural network
GPT-2
url https://www.mdpi.com/2076-3417/14/3/1193