Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model

Human hair segmentation is essential for face recognition and for natural-looking style transfer. However, it remains challenging because hair varies widely in appearance and forms complex patterns in images. In this study, we propose a novel method that uses diffusion-based genera...

Bibliographic Details
Main Authors: Dohyun Kim, Euna Lee, Daehyun Yoo, Hongchul Lee
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
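Subjects: Hair segmentation; fine-grained segmentation; generative model; diffusion model; text-to-image diffusion model; Figaro-1k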
Online Access: https://ieeexplore.ieee.org/document/10403896/
author Dohyun Kim
Euna Lee
Daehyun Yoo
Hongchul Lee
collection DOAJ
description Human hair segmentation is essential for face recognition and for natural-looking style transfer. However, it remains challenging because hair varies widely in appearance and forms complex patterns in images. In this study, we propose a novel method that uses diffusion-based generative models, which have been studied extensively in recent years, to capture and finely segment human hair. In diffusion-based models, the internal visual representation formed during the denoising process contains rich pixel-level information. Motivated by this, we extract this representation, which carries pixel-level semantic information, from a diffusion-based model and train a segmentation network on it. In particular, to segment hair more finely, our approach uses the representation from a text-to-image diffusion model conditioned on text, which yields information more relevant to human hair and supports the prediction of detailed hair masks. To validate our method, we conducted experiments on three hair-related datasets with distinct characteristics: Figaro-1k, CelebAMask-HQ, and Face Synthetics. On all three datasets, the proposed method outperforms existing methods in terms of mIoU (mean intersection over union), accuracy, precision, and F1-score. The improvement is most evident in separating hair from the background and from non-hair elements, which demonstrates the method's effectiveness on hair with complex characteristics. Our research contributes not only to fine-grained hair segmentation but also to the application of generative models in semantic segmentation tasks. We hope that the proposed method will be applied to detailed semantic segmentation in other fields in the future.
first_indexed 2024-03-08T09:32:19Z
format Article
id doaj.art-ca6f37300da54fc99fe1d10572270a18
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T09:32:19Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-ca6f37300da54fc99fe1d10572270a18; 2024-01-31T00:01:24Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 13912-13922; DOI 10.1109/ACCESS.2024.3355542; document 10403896
Dohyun Kim (https://orcid.org/0009-0002-8617-6543), School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea
Euna Lee, Center for Defense Resource Management, Korea Institute for Defense Analyses, Seoul, Republic of Korea
Daehyun Yoo, School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea
Hongchul Lee (https://orcid.org/0000-0002-4407-0348), School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea
title Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
topic Hair segmentation
fine-grained segmentation
generative model
diffusion model
text-to-image diffusion model
Figaro-1k
url https://ieeexplore.ieee.org/document/10403896/