Human face segmentation and feature learning
This report documents the application of two models of segmentation, namely semantic and instance segmentation, to the purpose of segmenting human faces. It details the training of both models, showcases results of the trained model, and explains explicitly wherever necessary the inner workings of s...
Main Author: | Winarta, Ferdian |
---|---|
Other Authors: | Gwee Bah Hwee |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | 2018 |
Subjects: | DRNTU::Engineering |
Online Access: | http://hdl.handle.net/10356/75288 |
_version_ | 1811680978944393216 |
---|---|
author | Winarta, Ferdian |
author2 | Gwee Bah Hwee |
author_facet | Gwee Bah Hwee Winarta, Ferdian |
author_sort | Winarta, Ferdian |
collection | NTU |
description | This report documents the application of two models of segmentation, namely semantic and instance segmentation, to the purpose of segmenting human faces. It details the training of both models, showcases results of the trained model, and explains explicitly wherever necessary the inner workings of such models in processing raw inputs into segmented outputs.
Split into two major parts, the report deals, within its first part, with semantic segmentation applied to two batches of images, firstly of single human faces and secondly of multiple human faces. Next, the report provides results of experiments aimed at evaluating possible causes for shortcomings of the trained model, such as the presence of headgear, adjacency of faces, angle of faces, etc. Saliency maps are then created to investigate whether specific features of the human face bear significance on segmentation results.
The second part of the report is concerned with the state-of-the-art instance segmentation model, Mask R-CNN. Firstly, the report details how such a model is trained to detect and segment, and at the same time distinguish between, multiple instances of human faces. Then it seeks to explain the step-by-step process of bounding box detection and mask segmentation in such results: the inner workings of the RPN, assignment of class IDs and class confidence scores, non-max suppression (NMS), etc. Finally, it details evaluation results of intersection-over-union (IoU) and pixel accuracies of segmentation masks. The mean average precision of bounding boxes was calculated to be 0.912. The average intersection-over-union accuracy was found to be 0.796 and the average pixel accuracy to be 0.844. |
first_indexed | 2024-10-01T03:33:39Z |
format | Final Year Project (FYP) |
id | ntu-10356/75288 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:33:39Z |
publishDate | 2018 |
record_format | dspace |
spelling | ntu-10356/75288 2023-07-07T17:19:04Z Human face segmentation and feature learning Winarta, Ferdian Gwee Bah Hwee School of Electrical and Electronic Engineering DRNTU::Engineering Bachelor of Engineering 2018-05-30T07:37:32Z 2018-05-30T07:37:32Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/75288 en Nanyang Technological University 55 p. application/pdf |
spellingShingle | DRNTU::Engineering Winarta, Ferdian Human face segmentation and feature learning |
title | Human face segmentation and feature learning |
title_sort | human face segmentation and feature learning |
topic | DRNTU::Engineering |
url | http://hdl.handle.net/10356/75288 |
work_keys_str_mv | AT winartaferdian humanfacesegmentationandfeaturelearning |
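
The description above reports an average intersection-over-union (IoU) and an average pixel accuracy for the segmentation masks, but this record does not say exactly how the report computed them. The sketch below shows one common per-image definition of each metric for binary masks, assuming masks are given as nested lists of 0/1 values. The function names `mask_iou` and `pixel_accuracy` and the toy masks are illustrative assumptions, not taken from the report.

```python
def mask_iou(pred, truth):
    """Intersection-over-union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    correct = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                correct += 1
            total += 1
    return correct / total if total else 1.0


# Toy 3x4 masks (hypothetical data): the prediction has one extra foreground pixel.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
truth = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(mask_iou(pred, truth))        # 3 shared pixels / 4 pixels in the union
print(pixel_accuracy(pred, truth))  # 11 matching pixels out of 12
```

Pixel accuracy counts background agreement as well as foreground agreement, so it tends to sit above IoU when the background dominates the image. Under these definitions that would be consistent with the ordering of the two reported averages (0.844 versus 0.796).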