Customized image synthesis using diffusion models

Recently, diffusion models have become a powerful mainstream method for image generation. Text-to-image diffusion models, in particular, have been widely used to convert a natural language description (e.g., ‘an orange cat’) to photorealistic images (e.g., a photo of an orange cat). These pre-tra...

Bibliographic Details
Main Author:	Fu, Guanqiao
Other Authors:	Liu Ziwei
Format:	Final Year Project (FYP)
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science Diffusion model
Online Access:	https://hdl.handle.net/10356/175199

_version_	1811677997021790208
author	Fu, Guanqiao
author2	Liu Ziwei
author_facet	Liu Ziwei Fu, Guanqiao
author_sort	Fu, Guanqiao
collection	NTU
description	Recently, diffusion models have become a powerful mainstream method for image generation. Text-to-image diffusion models, in particular, have been widely used to convert a natural language description (e.g., ‘an orange cat’) to photorealistic images (e.g., a photo of an orange cat). These pre-trained diffusion models have enabled various downstream applications, including customized image synthesis. For instance, a pre-trained text-to-image diffusion model can be leveraged to capture the appearance of a specific cat from multiple images and subsequently generate images of this cat in diverse scenarios. In this final year project, we introduce an integrated pipeline for storyboard generation. We begin by using large language models to assist in the creation of storylines, followed by the application of existing customization methods to visually render each scene. The pipeline is designed to leverage both language models and customization methods for efficient and effective storyboard generation. We demonstrate the usefulness of our proposed pipeline both qualitatively and quantitatively. Additionally, we present a comprehensive study of several diffusion models related to the latest advancements in customized image synthesis, experimentally comparing and analyzing them. We believe this project can enable and inspire subsequent explorations on applying customized image synthesis methods to automatic storyboard generation.
first_indexed	2024-10-01T02:46:15Z
format	Final Year Project (FYP)
id	ntu-10356/175199
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T02:46:15Z
publishDate	2024
publisher	Nanyang Technological University
record_format	dspace
spelling	ntu-10356/1751992024-04-19T15:42:52Z Customized image synthesis using diffusion models Fu, Guanqiao Liu Ziwei School of Computer Science and Engineering ziwei.liu@ntu.edu.sg Computer and Information Science Diffusion model Recently, diffusion models have become a powerful mainstream method for image generation. Text-to-image diffusion models, in particular, have been widely used to convert a natural language description (e.g., ‘an orange cat’) to photorealistic images (e.g., a photo of an orange cat). These pre-trained diffusion models have enabled various downstream applications, including customized image synthesis. For instance, a pre-trained text-to-image diffusion model can be leveraged to capture the appearance of a specific cat from multiple images and subsequently generate images of this cat in diverse scenarios. In this final year project, we introduce an integrated pipeline for storyboard generation. We begin by using large language models to assist in the creation of storylines, followed by the application of existing customization methods to visually render each scene. The pipeline is designed to leverage both language models and customization methods for efficient and effective storyboard generation. We demonstrate the usefulness of our proposed pipeline both qualitatively and quantitatively. Additionally, we present a comprehensive study of several diffusion models related to the latest advancements in customized image synthesis, experimentally comparing and analyzing them. We believe this project can enable and inspire subsequent explorations on applying customized image synthesis methods to automatic storyboard generation. Bachelor's degree 2024-04-19T13:18:32Z 2024-04-19T13:18:32Z 2024 Final Year Project (FYP) Fu, G. (2024). Customized image synthesis using diffusion models. Final Year Project (FYP), Nanyang Technological University, Singapore. 
https://hdl.handle.net/10356/175199 https://hdl.handle.net/10356/175199 en application/pdf Nanyang Technological University
spellingShingle	Computer and Information Science Diffusion model Fu, Guanqiao Customized image synthesis using diffusion models
title	Customized image synthesis using diffusion models
title_full	Customized image synthesis using diffusion models
title_fullStr	Customized image synthesis using diffusion models
title_full_unstemmed	Customized image synthesis using diffusion models
title_short	Customized image synthesis using diffusion models
title_sort	customized image synthesis using diffusion models
topic	Computer and Information Science Diffusion model
url	https://hdl.handle.net/10356/175199
work_keys_str_mv	AT fuguanqiao customizedimagesynthesisusingdiffusionmodels

Similar Items