Learning Robust Shape-Indexed Features for Facial Landmark Detection
In facial landmark detection, extracting shape-indexed features is widely applied in existing methods to impose shape constraint over landmarks. Commonly, these methods crop shape-indexed patches surrounding landmarks of a given initial shape. All landmarks are then detected jointly based on these p...
Main Authors: | Xintong Wan, Yifan Wu, Xiaoqiang Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2022-06-01 |
Series: | Applied Sciences |
Subjects: | facial landmark detection, shape-indexed feature, face shape constraint, biometrics |
Online Access: | https://www.mdpi.com/2076-3417/12/12/5828 |
_version_ | 1827662734555086848 |
---|---|
author | Xintong Wan, Yifan Wu, Xiaoqiang Li |
author_facet | Xintong Wan, Yifan Wu, Xiaoqiang Li |
author_sort | Xintong Wan |
collection | DOAJ |
description | In facial landmark detection, extracting shape-indexed features is widely applied in existing methods to impose shape constraint over landmarks. Commonly, these methods crop shape-indexed patches surrounding landmarks of a given initial shape. All landmarks are then detected jointly based on these patches, with shape constraint naturally embedded in the regressor. However, there are still two remaining challenges that cause the degradation of these methods. First, the initial shape may seriously deviate from the ground truth when presented with a large pose, resulting in considerable noise in the shape-indexed features. Second, extracting local patch features is vulnerable to occlusions due to missing facial context information under severe occlusion. To address the issues above, this paper proposes a facial landmark detection algorithm named Sparse-To-Dense Network (STDN). First, STDN employs a lightweight network to detect sparse facial landmarks and forms a reinitialized shape, which can efficiently improve the quality of cropped patches when presented with large poses. Then, a group-relational module is used to exploit the inherent geometric relations of the face, which further enhances the shape constraint against occlusion. 
Our method achieves 4.64% mean error with 1.97% failure rate on the COFW68 dataset, 3.48% mean error with 0.43% failure rate on the 300W dataset, and 7.12% mean error with 11.61% failure rate on the Masked 300W dataset. The results demonstrate that STDN achieves outstanding performance in comparison to state-of-the-art methods, especially on occlusion datasets. |
first_indexed | 2024-03-10T00:33:03Z |
format | Article |
id | doaj.art-fad14c93c56b4ab8a40581c750a1feee |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T00:33:03Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-fad14c93c56b4ab8a40581c750a1feee; 2023-11-23T15:23:05Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-06-01; vol. 12, no. 12, art. 5828; 10.3390/app12125828; Learning Robust Shape-Indexed Features for Facial Landmark Detection; Xintong Wan, Yifan Wu, Xiaoqiang Li (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China); https://www.mdpi.com/2076-3417/12/12/5828; facial landmark detection, shape-indexed feature, face shape constraint, biometrics |
spellingShingle | Xintong Wan Yifan Wu Xiaoqiang Li Learning Robust Shape-Indexed Features for Facial Landmark Detection Applied Sciences facial landmark detection shape-indexed feature face shape constraint biometrics |
title | Learning Robust Shape-Indexed Features for Facial Landmark Detection |
title_sort | learning robust shape indexed features for facial landmark detection |
topic | facial landmark detection, shape-indexed feature, face shape constraint, biometrics |
url | https://www.mdpi.com/2076-3417/12/12/5828 |
work_keys_str_mv | AT xintongwan learningrobustshapeindexedfeaturesforfaciallandmarkdetection AT yifanwu learningrobustshapeindexedfeaturesforfaciallandmarkdetection AT xiaoqiangli learningrobustshapeindexedfeaturesforfaciallandmarkdetection |
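
The description above says that shape-indexed methods crop patches around the landmarks of a given initial shape. The sketch below illustrates only that cropping step. It is not the STDN implementation: the function names `crop_patch` and `shape_indexed_patches`, the zero padding outside the image, and the use of plain nested lists for a grayscale image are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]    # grayscale image as rows of pixel values (assumed layout)
Point = Tuple[float, float]  # (x, y) landmark position in pixel coordinates


def crop_patch(image: Image, center: Point, half_size: int,
               pad_value: float = 0.0) -> Image:
    """Crop a square patch of side 2*half_size+1 centred on one landmark.

    Pixels that fall outside the image are filled with pad_value
    (a hypothetical choice; the source does not specify padding).
    """
    height, width = len(image), len(image[0])
    cx, cy = int(round(center[0])), int(round(center[1]))
    patch = []
    for y in range(cy - half_size, cy + half_size + 1):
        row = []
        for x in range(cx - half_size, cx + half_size + 1):
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else pad_value)
        patch.append(row)
    return patch


def shape_indexed_patches(image: Image, shape: Sequence[Point],
                          half_size: int = 1) -> List[Image]:
    """Crop one patch per landmark of the given (initial) shape."""
    return [crop_patch(image, point, half_size) for point in shape]


# Toy 5x5 image whose pixel value encodes its position as 10*y + x.
toy_image = [[10 * y + x for x in range(5)] for y in range(5)]
# Hypothetical two-landmark shape: one interior point, one at the corner.
toy_shape = [(2.0, 2.0), (0.0, 0.0)]
patches = shape_indexed_patches(toy_image, toy_shape, half_size=1)
```

Because the patches are indexed by the current shape, a poor initial shape moves every crop away from the true landmark. This is the noise under large poses that the description refers to, and it is the motivation given for re-initializing the shape from sparse landmarks before cropping.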