Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model

Human hair segmentation is essential for face recognition and for achieving natural transformations in style transfer. However, it remains a challenging task due to the diverse appearances and complex patterns of hair in images. In this study, we propose a novel method utilizing diffusion-based genera...


Bibliographic Details
Main Authors:	Dohyun Kim, Euna Lee, Daehyun Yoo, Hongchul Lee
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Hair segmentation; fine-grained segmentation; generative model; diffusion model; text-to-image diffusion model; Figaro-1k
Online Access:	https://ieeexplore.ieee.org/document/10403896/

_version_	1797338511581380608
author	Dohyun Kim; Euna Lee; Daehyun Yoo; Hongchul Lee
author_facet	Dohyun Kim; Euna Lee; Daehyun Yoo; Hongchul Lee
author_sort	Dohyun Kim
collection	DOAJ
description	Human hair segmentation is essential for face recognition and for achieving natural transformations in style transfer. However, it remains a challenging task due to the diverse appearances and complex patterns of hair in images. In this study, we propose a novel method that uses diffusion-based generative models, which have been extensively researched recently, to effectively capture and finely segment human hair. In diffusion-based models, the internal visual representation formed during the denoising process contains rich pixel-level information. Inspired by this, we introduce diffusion-based models for fine-grained human hair segmentation. Specifically, we extract the representation from the diffusion-based models, which contains pixel-level semantic information, and then train a segmentation network on it. In particular, to segment human hair more finely, our approach employs the representation from a text-to-image diffusion model, conditioned on text information, to extract information more relevant to human hair, thereby predicting detailed hair masks. To validate our method, we conducted experiments on three distinct hair-related datasets with unique characteristics: Figaro-1k, CelebAMask-HQ, and Face Synthetics. The experimental results show that our proposed method improves performance across all three datasets, outperforming existing methods in terms of mIoU (mean intersection over union), accuracy, precision, and F1-score. The improvement is particularly evident in its ability to accurately separate human hair from background and non-hair elements. This demonstrates the effectiveness of our method in accurately and finely segmenting human hair with complex characteristics. Our research contributes not only to the fine-grained segmentation of human hair but also to the application of generative models in semantic segmentation tasks. We hope that the proposed method will be applied to detailed semantic segmentation in various fields in the future.
first_indexed	2024-03-08T09:32:19Z
format	Article
id	doaj.art-ca6f37300da54fc99fe1d10572270a18
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-08T09:32:19Z
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-ca6f37300da54fc99fe1d10572270a18; 2024-01-31T00:01:24Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 13912; 13922; 10.1109/ACCESS.2024.3355542; 10403896; Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model; Dohyun Kim [0] https://orcid.org/0009-0002-8617-6543; Euna Lee [1]; Daehyun Yoo [2]; Hongchul Lee [3] https://orcid.org/0000-0002-4407-0348; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; Center for Defense Resource Management, Korea Institute for Defense Analyses, Seoul, Republic of Korea; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea; Human hair segmentation is essential for face recognition and for achieving natural transformations in style transfer. However, it remains a challenging task due to the diverse appearances and complex patterns of hair in images. In this study, we propose a novel method that uses diffusion-based generative models, which have been extensively researched recently, to effectively capture and finely segment human hair. In diffusion-based models, the internal visual representation formed during the denoising process contains rich pixel-level information. Inspired by this, we introduce diffusion-based models for fine-grained human hair segmentation. Specifically, we extract the representation from the diffusion-based models, which contains pixel-level semantic information, and then train a segmentation network on it. In particular, to segment human hair more finely, our approach employs the representation from a text-to-image diffusion model, conditioned on text information, to extract information more relevant to human hair, thereby predicting detailed hair masks. To validate our method, we conducted experiments on three distinct hair-related datasets with unique characteristics: Figaro-1k, CelebAMask-HQ, and Face Synthetics. The experimental results show that our proposed method improves performance across all three datasets, outperforming existing methods in terms of mIoU (mean intersection over union), accuracy, precision, and F1-score. The improvement is particularly evident in its ability to accurately separate human hair from background and non-hair elements. This demonstrates the effectiveness of our method in accurately and finely segmenting human hair with complex characteristics. Our research contributes not only to the fine-grained segmentation of human hair but also to the application of generative models in semantic segmentation tasks. We hope that the proposed method will be applied to detailed semantic segmentation in various fields in the future.; https://ieeexplore.ieee.org/document/10403896/; Hair segmentation; fine-grained segmentation; generative model; diffusion model; text-to-image diffusion model; Figaro-1k
spellingShingle	Dohyun Kim Euna Lee Daehyun Yoo Hongchul Lee Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model IEEE Access Hair segmentation fine-grained segmentation generative model diffusion model text-to-image diffusion model Figaro-1k
title	Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
title_full	Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
title_fullStr	Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
title_full_unstemmed	Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
title_short	Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
title_sort	fine grained human hair segmentation using a text to image diffusion model
topic	Hair segmentation; fine-grained segmentation; generative model; diffusion model; text-to-image diffusion model; Figaro-1k
url	https://ieeexplore.ieee.org/document/10403896/
work_keys_str_mv	AT dohyunkim finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT eunalee finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT daehyunyoo finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT hongchullee finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel
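
The description above reports results in terms of mIoU, accuracy, precision, and F1-score. As a rough illustration of how such metrics are commonly computed for a binary hair mask, here is a minimal Python sketch. It is not taken from the article: the function names (`confusion_counts`, `hair_metrics`), the flat 0/1 mask representation, and the choice to average IoU over the hair and background classes are all assumptions made for this example, and the article's exact evaluation protocol may differ.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for flat binary masks (1 = hair, 0 = not hair)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def hair_metrics(pred, target):
    """Compute the four metrics for one predicted mask (hypothetical helper)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    iou_hair = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        # Assumption: mIoU is the mean over the two classes, hair and background.
        "mIoU": (iou_hair + iou_background) / 2,
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "F1": safe_div(2 * precision * recall, precision + recall),
    }


# Toy example: six pixels, flattened prediction and ground-truth masks.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
metrics = hair_metrics(pred, target)
```

In practice the masks would be flattened image arrays and the metrics would be averaged over a whole test set; the toy lists here only keep the arithmetic easy to follow.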


Similar Items