Human hair segmentation is essential for face recognition and for achieving natural transformations in style transfer. However, it remains a challenging task due to the diverse appearances and complex patterns of hair in images. In this study, we propose a novel method that utilizes diffusion-based generative models, which have been studied extensively in recent years, to effectively capture and finely segment human hair. In diffusion-based models, the internal visual representations produced during the denoising process contain rich pixel-level information. Inspired by this property, we introduce diffusion-based models for fine-grained human hair segmentation. Specifically, we extract the representation, which contains pixel-level semantic information, from the diffusion-based model and then train a segmentation network on it. In particular, to segment human hair more finely, our approach employs the representation from a text-to-image diffusion model, conditioned on text information, to extract information more relevant to human hair and thereby predict detailed hair masks. To validate our method, we conducted experiments on three distinct hair-related datasets with unique characteristics: Figaro-1k, CelebAMask-HQ, and Face Synthetics. The experimental results show improved performance of the proposed method across all three datasets, outperforming existing methods in terms of mIoU (mean intersection over union), accuracy, precision, and F1-score. This is particularly evident in its ability to accurately capture and finely separate human hair from the background and non-hair elements, demonstrating the effectiveness of our method in segmenting hair with complex characteristics accurately and in fine detail. Our research contributes not only to the fine-grained segmentation of human hair but also to the application of generative models to semantic segmentation tasks. We hope that the proposed method will be applied to detailed semantic segmentation in various fields in the future.
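To make the pipeline described above concrete, the following is a minimal sketch of the core idea: extracting intermediate denoising features from a frozen text-to-image diffusion model and training a lightweight segmentation head on them. It assumes Stable Diffusion v1-5 loaded through the Hugging Face diffusers library; the prompt text, noise timestep, hook placement, and the HairHead architecture are illustrative assumptions for this sketch, not the authors' exact configuration.

```python
# Sketch: text-conditioned diffusion features as input to a hair-segmentation head.
import torch
import torch.nn as nn
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
).to(device)
unet, vae, scheduler = pipe.unet, pipe.vae, pipe.scheduler

# Capture intermediate U-Net activations via forward hooks on the decoder (up) blocks.
features = []
hooks = [
    blk.register_forward_hook(
        lambda m, inp, out: features.append(out[0] if isinstance(out, tuple) else out)
    )
    for blk in unet.up_blocks
]

@torch.no_grad()
def hair_features(image, prompt="a photograph of a person's hair", t=100):
    """image: (B, 3, 512, 512) RGB tensor normalized to [-1, 1]."""
    features.clear()
    # Encode to latents and add noise at an intermediate timestep (illustrative choice).
    latents = vae.encode(image.to(device)).latent_dist.sample() * vae.config.scaling_factor
    timestep = torch.tensor([t], device=device)
    noisy = scheduler.add_noise(latents, torch.randn_like(latents), timestep)
    # Text conditioning steers the representation toward hair-relevant content.
    tokens = pipe.tokenizer(
        prompt, padding="max_length",
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
    ).input_ids.to(device)
    text_emb = pipe.text_encoder(tokens)[0].repeat(latents.shape[0], 1, 1)
    # One denoising forward pass; the hooks collect multi-scale features.
    unet(noisy, timestep, encoder_hidden_states=text_emb)
    size = features[-1].shape[-2:]
    return torch.cat(
        [F.interpolate(f.float(), size=size, mode="bilinear") for f in features], dim=1
    )

class HairHead(nn.Module):
    """Lightweight head trained on frozen diffusion features to predict hair masks."""
    def __init__(self, in_ch, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, num_classes, 1),
        )

    def forward(self, feats, out_size):
        return F.interpolate(self.net(feats), size=out_size, mode="bilinear")

# Example usage: infer the feature dimensionality once, then train the head with
# cross-entropy against ground-truth hair masks (e.g. from CelebAMask-HQ).
img = torch.randn(1, 3, 512, 512)              # stand-in for a normalized input image
feats = hair_features(img)
head = HairHead(in_ch=feats.shape[1]).to(device)
logits = head(feats, out_size=img.shape[-2:])  # (1, 2, 512, 512) hair / non-hair logits
```

In this sketch the diffusion model stays frozen and only the small head is optimized, which mirrors the general strategy of reusing denoising representations for dense prediction; the specific training losses and feature-fusion details of the proposed method are described in the main text.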