A Study on Generating Webtoons Using Multilingual Text-to-Image Models

Text-to-image technology enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology involves extracting features from input text; subsequently, these features are combined with noise and used as input to a GAN, which generates imag...


Bibliographic Details
Main Authors: Kyungho Yu, Hyoungju Kim, Jeongin Kim, Chanjun Chun, Pankoo Kim
Format: Article
Language: English
Published: MDPI AG 2023-06-01
Series: Applied Sciences
Subjects: multilingual BERT, text-to-image, DCGAN, webtoon, GAN
Online Access: https://www.mdpi.com/2076-3417/13/12/7278
author Kyungho Yu
Hyoungju Kim
Jeongin Kim
Chanjun Chun
Pankoo Kim
author_sort Kyungho Yu
collection DOAJ
description Text-to-image technology enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology involves extracting features from input text; subsequently, these features are combined with noise and used as input to a GAN, which generates images similar to the original images via competition between the generator and discriminator. Although images have been extensively generated from English text, multilingual text-to-image technology covering languages such as Korean is still in its developmental stage. Webtoons are a digital comic format for viewing comics online. The webtoon creation process involves story planning, content/sketching, coloring, and background drawing, all of which require human intervention, making the process time-consuming and expensive. Therefore, this study proposes a multilingual text-to-image model capable of generating webtoon images when presented with multilingual input text. The proposed model employs multilingual BERT to extract feature vectors for multiple languages and trains a DCGAN in conjunction with the images. The experimental results demonstrate that, after training, the model can generate images similar to the original images when presented with multilingual input text. The evaluation metrics further support these findings: the generated images achieved an Inception score of 4.99 and an FID score of 22.21.
first_indexed 2024-03-11T02:47:34Z
format Article
id doaj.art-47cb69c9d0844f0ba4c4e6b3fc5234b8
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-11T02:47:34Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-47cb69c9d0844f0ba4c4e6b3fc5234b8
2023-11-18T09:11:38Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-06-01
13
12
7278
10.3390/app13127278
A Study on Generating Webtoons Using Multilingual Text-to-Image Models
Kyungho Yu (0)
Hyoungju Kim (1)
Jeongin Kim (2)
Chanjun Chun (3)
Pankoo Kim (4)
0-4: Department of Computer Engineering, Chosun University, 309 Pilmun-Daero, Dong-Gu, Gwangju 61452, Republic of Korea
https://www.mdpi.com/2076-3417/13/12/7278
multilingual BERT
text-to-image
DCGAN
webtoon
GAN
title A Study on Generating Webtoons Using Multilingual Text-to-Image Models
topic multilingual BERT
text-to-image
DCGAN
webtoon
GAN
url https://www.mdpi.com/2076-3417/13/12/7278