FreeGaze: A Framework for 3D Gaze Estimation Using Appearance Cues from a Facial Video
Gaze is a significant behavioral characteristic that can be used to reflect a person’s attention. In recent years, there has been a growing interest in estimating gaze from facial videos. However, gaze estimation remains a challenging problem due to variations in appearance and head poses. To addres...
Main Authors: | Shang Tian, Haiyan Tu, Ling He, Yue Ivan Wu, Xiujuan Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Sensors |
Subjects: | gaze estimation; dual-branch CNN; improved normalization; eye features; face features |
Online Access: | https://www.mdpi.com/1424-8220/23/23/9604 |
_version_ | 1827591945524871168 |
---|---|
author | Shang Tian, Haiyan Tu, Ling He, Yue Ivan Wu, Xiujuan Zheng |
author_sort | Shang Tian |
collection | DOAJ |
description | Gaze is a significant behavioral characteristic that can be used to reflect a person’s attention. In recent years, there has been a growing interest in estimating gaze from facial videos. However, gaze estimation remains a challenging problem due to variations in appearance and head poses. To address this, a framework for 3D gaze estimation using appearance cues is developed in this study. The framework begins with an end-to-end approach to detect facial landmarks. Subsequently, we employ a normalization method and improve it using orthogonal matrices; comparative experiments show that the improved normalization method achieves higher accuracy and lower computational time in gaze estimation. Finally, we introduce a dual-branch convolutional neural network, named FG-Net, which processes the normalized images and extracts eye and face features through two branches. The extracted features are then integrated and fed into a fully connected layer to estimate the 3D gaze vectors. To evaluate the performance of our approach, we conduct ten-fold cross-validation experiments on two public datasets, MPIIGaze and EyeDiap, achieving accuracies of 3.11° and 2.75°, respectively. The results demonstrate the effectiveness of the proposed framework and its state-of-the-art performance in 3D gaze estimation. |
first_indexed | 2024-03-09T01:41:48Z |
format | Article |
id | doaj.art-259982f297f2416e8b5df56e5d34f21c |
institution | Directory of Open Access Journals |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-09T01:41:48Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-259982f297f2416e8b5df56e5d34f21c; 2023-12-08T15:26:33Z; eng; MDPI AG; Sensors; 1424-8220; 2023-12-01; vol. 23, no. 23, art. 9604; doi:10.3390/s23239604; FreeGaze: A Framework for 3D Gaze Estimation Using Appearance Cues from a Facial Video; Shang Tian (College of Electrical Engineering, Sichuan University, Chengdu 610065, China); Haiyan Tu (College of Electrical Engineering, Sichuan University, Chengdu 610065, China); Ling He (College of Biomedical Engineering, Sichuan University, Chengdu 610065, China); Yue Ivan Wu (College of Computer Science, Sichuan University, Chengdu 610065, China); Xiujuan Zheng (College of Electrical Engineering, Sichuan University, Chengdu 610065, China); https://www.mdpi.com/1424-8220/23/23/9604; gaze estimation; dual-branch CNN; improved normalization; eye features; face features |
title | FreeGaze: A Framework for 3D Gaze Estimation Using Appearance Cues from a Facial Video |
title_sort | freegaze a framework for 3d gaze estimation using appearance cues from a facial video |
topic | gaze estimation; dual-branch CNN; improved normalization; eye features; face features |
url | https://www.mdpi.com/1424-8220/23/23/9604 |