Optimizing Prompts Using In-Context Few-Shot Learning for Text-to-Image Generative Models



Bibliographic Details
Main Authors: Seunghun Lee, Jihoon Lee, Chan Ho Bae, Myung-Seok Choi, Ryong Lee, Sangtae Ahn
Format: Article
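Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: In-context few-shot learning, pretrained language model, prompt optimization, text-to-image generation
Online Access: https://ieeexplore.ieee.org/document/10378642/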
author Seunghun Lee
Jihoon Lee
Chan Ho Bae
Myung-Seok Choi
Ryong Lee
Sangtae Ahn
collection DOAJ
description Various text-to-image generative models have recently been released, demonstrating their ability to generate high-quality synthesized images from text prompts. Despite these advances, determining the appropriate text prompt to obtain a desired image remains challenging. The quality of the synthesized images depends heavily on the user input, making it difficult to achieve consistent and satisfactory results. This limitation has created a need for an effective prompt optimization method that automatically generates optimized text prompts for text-to-image generative models. This study therefore proposes a prompt optimization method that uses in-context few-shot learning in a pretrained language model. The proposed approach generates optimized text prompts to guide the image synthesis process by leveraging the contextual information available in a few text examples. The results show that images synthesized using the proposed prompt optimization method achieved higher performance, by 18% on average, on an evaluation metric that measures the similarity between the generated images and the prompts used to generate them. The significance of this research lies in its potential to provide a more efficient and automated approach to obtaining high-quality synthesized images. The findings indicate that prompt optimization may offer a promising pathway for text-to-image generative models.
first_indexed 2024-03-08T15:54:28Z
format Article
id doaj.art-362edabda0ec42248c87aefde433fae1
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T15:54:28Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-362edabda0ec42248c87aefde433fae1
2024-01-09T00:04:27Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 2660-2673
DOI: 10.1109/ACCESS.2023.3348778
Document number: 10378642
Optimizing Prompts Using In-Context Few-Shot Learning for Text-to-Image Generative Models
Seunghun Lee (https://orcid.org/0009-0004-9419-448X), School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Jihoon Lee (https://orcid.org/0009-0001-6665-3739), School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Chan Ho Bae (https://orcid.org/0009-0007-6620-6641), School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Myung-Seok Choi (https://orcid.org/0000-0003-4821-3390), AI Data Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon, South Korea
Ryong Lee (https://orcid.org/0000-0001-5142-6106), AI Data Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon, South Korea
Sangtae Ahn (https://orcid.org/0000-0001-9487-5649), School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
title Optimizing Prompts Using In-Context Few-Shot Learning for Text-to-Image Generative Models
topic In-context few-shot learning
pretrained language model
prompt optimization
text-to-image generation
url https://ieeexplore.ieee.org/document/10378642/