Learning Robust Shape-Indexed Features for Facial Landmark Detection

In facial landmark detection, extracting shape-indexed features is widely applied in existing methods to impose shape constraint over landmarks. Commonly, these methods crop shape-indexed patches surrounding landmarks of a given initial shape. All landmarks are then detected jointly based on these p...

Bibliographic Details
Main Authors: Xintong Wan, Yifan Wu, Xiaoqiang Li
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Applied Sciences
Subjects: facial landmark detection; shape-indexed feature; face shape constraint; biometrics
Online Access: https://www.mdpi.com/2076-3417/12/12/5828
author Xintong Wan
Yifan Wu
Xiaoqiang Li
collection DOAJ
description In facial landmark detection, extracting shape-indexed features is widely applied in existing methods to impose shape constraint over landmarks. Commonly, these methods crop shape-indexed patches surrounding landmarks of a given initial shape. All landmarks are then detected jointly based on these patches, with shape constraint naturally embedded in the regressor. However, there are still two remaining challenges that cause the degradation of these methods. First, the initial shape may seriously deviate from the ground truth when presented with a large pose, resulting in considerable noise in the shape-indexed features. Second, extracting local patch features is vulnerable to occlusions due to missing facial context information under severe occlusion. To address the issues above, this paper proposes a facial landmark detection algorithm named Sparse-To-Dense Network (STDN). First, STDN employs a lightweight network to detect sparse facial landmarks and forms a reinitialized shape, which can efficiently improve the quality of cropped patches when presented with large poses. Then, a group-relational module is used to exploit the inherent geometric relations of the face, which further enhances the shape constraint against occlusion. 
Our method achieves 4.64% mean error with 1.97% failure rate on COFW68 dataset, 3.48% mean error with 0.43% failure rate on 300 W dataset and 7.12% mean error with 11.61% failure rate on Masked 300 W dataset. The results demonstrate that STDN achieves outstanding performance in comparison to state-of-the-art methods, especially on occlusion datasets.
first_indexed 2024-03-10T00:33:03Z
format Article
id doaj.art-fad14c93c56b4ab8a40581c750a1feee
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:33:03Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-fad14c93c56b4ab8a40581c750a1feee; 2023-11-23T15:23:05Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-06-01; 12; 12; 5828; 10.3390/app12125828; Learning Robust Shape-Indexed Features for Facial Landmark Detection; Xintong Wan, Yifan Wu, Xiaoqiang Li (each: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China); https://www.mdpi.com/2076-3417/12/12/5828; facial landmark detection; shape-indexed feature; face shape constraint; biometrics
title Learning Robust Shape-Indexed Features for Facial Landmark Detection
topic facial landmark detection
shape-indexed feature
face shape constraint
biometrics
url https://www.mdpi.com/2076-3417/12/12/5828
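
The description above mentions cropping shape-indexed patches around the landmarks of an initial shape, and reports results as mean error and failure rate. The following is a minimal Python sketch of those two ideas, not the authors' implementation. Everything named here is an assumption for illustration: the function names, the patch size, zero padding at image borders, the normalization distance passed to the error function, and the 10% failure threshold (a common convention in landmark benchmarks that the record itself does not state).

```python
import numpy as np


def crop_shape_indexed_patches(image, shape, patch_size=16):
    """Crop one square patch centered on each landmark of `shape`.

    image: (H, W) or (H, W, C) array.
    shape: (N, 2) sequence of (x, y) landmark coordinates.
    Patches that cross the image border are zero-padded (assumption).
    """
    image = np.asarray(image)
    half = patch_size // 2
    pad = [(half, half), (half, half)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant")
    patches = []
    for x, y in np.round(np.asarray(shape, dtype=float)).astype(int):
        # Clamp so a landmark predicted outside the image still yields a patch.
        x = int(np.clip(x, 0, image.shape[1] - 1))
        y = int(np.clip(y, 0, image.shape[0] - 1))
        # In padded coordinates the landmark sits at (y + half, x + half),
        # so the window starting at (y, x) has the landmark at index `half`.
        patches.append(padded[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)


def normalized_mean_error(pred, gt, norm_dist):
    """Mean Euclidean landmark error divided by a normalization distance.

    `norm_dist` is supplied by the caller (for example an inter-ocular
    distance); which distance the paper uses is not stated in this record.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.mean(np.linalg.norm(pred - gt, axis=1)) / norm_dist)


def failure_rate(errors, threshold=0.10):
    """Fraction of images whose normalized error exceeds `threshold`."""
    return float(np.mean(np.asarray(errors, dtype=float) > threshold))


if __name__ == "__main__":
    img = np.arange(100, dtype=float).reshape(10, 10)
    init_shape = [[5, 5], [0, 0]]  # hypothetical initial shape, (x, y) pairs
    patches = crop_shape_indexed_patches(img, init_shape, patch_size=4)
    print(patches.shape)
```

A poor initial shape moves every crop window away from the true landmark, which is the noise source the description attributes to large poses; re-centering the windows on a better initial shape is the motivation it gives for the sparse-landmark reinitialization step.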