Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
Human hair segmentation is essential for face recognition and for natural-looking style transfer. However, it remains a challenging task because of the diverse appearance and complex patterns of hair in images. In this study, we propose a novel method utilizing diffusion-based genera...
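Main Authors: | Dohyun Kim, Euna Lee, Daehyun Yoo, Hongchul Lee |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
Subjects: | Hair segmentation, fine-grained segmentation, generative model, diffusion model, text-to-image diffusion model, Figaro-1k |
Online Access: | https://ieeexplore.ieee.org/document/10403896/ |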
_version_ | 1797338511581380608 |
---|---|
author | Dohyun Kim Euna Lee Daehyun Yoo Hongchul Lee |
author_facet | Dohyun Kim Euna Lee Daehyun Yoo Hongchul Lee |
author_sort | Dohyun Kim |
collection | DOAJ |
description | Human hair segmentation is essential for face recognition and for natural-looking style transfer. However, it remains a challenging task because of the diverse appearance and complex patterns of hair in images. In this study, we propose a novel method that uses diffusion-based generative models, which have been studied extensively in recent years, to capture and finely segment human hair. In diffusion-based models, the internal visual representation formed during the denoising process contains rich pixel-level information. Inspired by this, we introduce diffusion-based models for fine-grained human hair segmentation. Specifically, we extract this representation, which carries pixel-level semantic information, from the diffusion-based models and then train a segmentation network on it. To segment hair more finely, our approach uses the representation from a text-to-image diffusion model conditioned on text information, which yields information more relevant to human hair and thereby predicts detailed hair masks. To validate our method, we conducted experiments on three hair-related datasets with distinct characteristics: Figaro-1k, CelebAMask-HQ, and Face Synthetics. The experimental results show that the proposed method outperforms existing methods on all three datasets in terms of mIoU (mean intersection over union), accuracy, precision, and F1-score. The improvement is particularly evident in its ability to separate human hair accurately and finely from the background and non-hair elements, which demonstrates its effectiveness on hair with complex characteristics. Our research contributes not only to the fine-grained segmentation of human hair but also to the application of generative models in semantic segmentation tasks. We hope that the proposed method will be applied to detailed semantic segmentation in various fields in the future. |
first_indexed | 2024-03-08T09:32:19Z |
format | Article |
id | doaj.art-ca6f37300da54fc99fe1d10572270a18 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T09:32:19Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-ca6f37300da54fc99fe1d10572270a18; 2024-01-31T00:01:24Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 13912-13922; DOI 10.1109/ACCESS.2024.3355542; 10403896; Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model; Dohyun Kim (https://orcid.org/0009-0002-8617-6543; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea); Euna Lee (Center for Defense Resource Management, Korea Institute for Defense Analyses, Seoul, Republic of Korea); Daehyun Yoo (School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea); Hongchul Lee (https://orcid.org/0000-0002-4407-0348; School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea); https://ieeexplore.ieee.org/document/10403896/; Hair segmentation; fine-grained segmentation; generative model; diffusion model; text-to-image diffusion model; Figaro-1k |
spellingShingle | Dohyun Kim Euna Lee Daehyun Yoo Hongchul Lee Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model IEEE Access Hair segmentation fine-grained segmentation generative model diffusion model text-to-image diffusion model Figaro-1k |
title | Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model |
title_full | Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model |
title_fullStr | Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model |
title_full_unstemmed | Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model |
title_short | Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model |
title_sort | fine grained human hair segmentation using a text to image diffusion model |
topic | Hair segmentation fine-grained segmentation generative model diffusion model text-to-image diffusion model Figaro-1k |
url | https://ieeexplore.ieee.org/document/10403896/ |
work_keys_str_mv | AT dohyunkim finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT eunalee finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT daehyunyoo finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel AT hongchullee finegrainedhumanhairsegmentationusingatexttoimagediffusionmodel |
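
The description in this record names mIoU, accuracy, precision, and F1-score as the evaluation metrics. The following is a minimal sketch of how those quantities could be computed for a single binary hair mask. It is not the authors' evaluation code: the function name `hair_mask_metrics`, the flat 0/1 list representation, and the choice to average IoU over the two classes (hair and background) are assumptions made for illustration, since the record does not say how the paper aggregates its scores.

```python
def hair_mask_metrics(pred, target):
    """Return segmentation metrics for two flat 0/1 masks (1 = hair pixel).

    Hypothetical helper for illustration; mIoU here is the mean of the
    hair IoU and the background IoU on one mask (an assumption).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Confusion-matrix counts with hair as the positive class.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    iou_hair = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)

    return {
        "miou": (iou_hair + iou_background) / 2,
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    ground_truth = [1, 0, 0, 0, 1, 1]
    for name, value in hair_mask_metrics(predicted, ground_truth).items():
        print(f"{name}: {value:.3f}")
```

In practice a 2-D mask would be flattened before the call, and scores would be aggregated over a whole dataset; whether that aggregation is per image or over pooled pixel counts is a design choice this record does not specify.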