Artfusion: A Diffusion Model-Based Style Synthesis Framework for Portraits

We present a diffusion model-based approach that applies the artistic style of an artist or an art movement to a portrait photograph. Learning a style from such artworks ordinarily requires a training dataset composed of many samples. We resolve this limitation by combining a Contrastive Language-Image Pretraining (CLIP) encoder with a diffusion model, since the CLIP encoder extracts features from an input portrait very effectively. Our framework includes three independent CLIP encoders that extract text features, color features, and Canny edge features from an input portrait, respectively. These features are combined with the style information extracted by a diffusion model to stylize the input portrait. The diffusion model extracts the style information from the sample images in the training dataset using an image encoder, and its denoising steps apply this style information to the CLIP-based features of the input portrait. The result is an artistic portrait that preserves both the identity of the input portrait and the artistic style of the training dataset. The most important contribution of our framework is that it requires fewer than a hundred sample images per artistic style, so it can successfully capture the style of an artist who produced fewer than a hundred artworks. We sample three artists and three art movements, apply these styles to portraits of various identities, and produce visually pleasing results. We evaluate our results with several metrics, including Fréchet Inception Distance (FID), ArtFID, and the Language-Image Quality Evaluator (LIQE), to demonstrate their quality.
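
For readers who want a concrete picture of the conditioning scheme the abstract describes, the sketch below is one plausible reading of it in Python. It is a minimal illustration, not the authors' implementation: it reuses a single pretrained Hugging Face CLIP model where the paper uses three independent encoders, and every function name and preprocessing choice (the Gaussian blur for the color branch, the Canny thresholds, mean-pooling the style set) is an assumption.

```python
# Hypothetical sketch of the three-branch CLIP conditioning described in the
# abstract; names and preprocessing choices are assumptions, not the paper's.
import cv2
import numpy as np
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


@torch.no_grad()
def extract_conditions(portrait_bgr: np.ndarray, caption: str) -> torch.Tensor:
    """Build the three condition vectors the abstract names: text, color, edge."""
    # Text branch: CLIP text features from a caption describing the portrait.
    text_in = processor(text=[caption], return_tensors="pt", padding=True)
    text_feat = clip.get_text_features(**text_in)

    # Color branch: blur the portrait heavily so that mostly the color layout,
    # not the structure, survives, then encode it with the image tower.
    blurred = cv2.GaussianBlur(portrait_bgr, (31, 31), 0)
    color_in = processor(images=cv2.cvtColor(blurred, cv2.COLOR_BGR2RGB),
                         return_tensors="pt")
    color_feat = clip.get_image_features(**color_in)

    # Edge branch: encode a Canny edge map, which carries the structural
    # (identity) cues of the face.
    gray = cv2.cvtColor(portrait_bgr, cv2.COLOR_BGR2GRAY)
    edges = np.repeat(cv2.Canny(gray, 100, 200)[..., None], 3, axis=-1)
    edge_in = processor(images=edges, return_tensors="pt")
    edge_feat = clip.get_image_features(**edge_in)

    # The concatenated vector would condition the diffusion model's denoising
    # steps (e.g., via cross-attention), alongside the style embedding below.
    return torch.cat([text_feat, color_feat, edge_feat], dim=-1)


@torch.no_grad()
def style_embedding(artworks: list) -> torch.Tensor:
    """Mean-pool image features over a small style set (< 100 artworks)."""
    inputs = processor(images=artworks, return_tensors="pt")
    return clip.get_image_features(**inputs).mean(dim=0, keepdim=True)
```

Of the metrics the abstract lists, ArtFID is a published combination of style fidelity and content preservation, defined as ArtFID = (1 + LPIPS) · (1 + FID), with LPIPS measured between each input and its stylized output and FID between the stylized outputs and the style dataset; lower is better for all three quantities.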

Bibliographic Details
Main Authors: Hyemin Yang, Heekyung Yang, Kyungha Min
Author Affiliations: Hyemin Yang and Kyungha Min, Department of Computer Science, Sangmyung University, Seoul 03016, Republic of Korea; Heekyung Yang, Department of Software, Sangmyung University, Cheonan 31066, Republic of Korea
Format: Article
Language: English
Published: MDPI AG, 2024-01-01
Series: Electronics, Vol. 13, No. 3, Article 509
ISSN: 2079-9292
DOI: 10.3390/electronics13030509
Subjects: stylization; portrait stylization; diffusion model; CLIP encoder; artist style; art movement style
Online Access: https://www.mdpi.com/2079-9292/13/3/509