FreeGaze: A Framework for 3D Gaze Estimation Using Appearance Cues from a Facial Video

Gaze is a significant behavioral characteristic that reflects a person's attention. In recent years, there has been growing interest in estimating gaze from facial videos; however, gaze estimation remains challenging due to variations in appearance and head pose. To address this, this study develops a framework for 3D gaze estimation using appearance cues. The framework begins with an end-to-end approach to detect facial landmarks. Subsequently, we apply a normalization method, improve it using orthogonal matrices, and show through comparative experiments that the improved normalization achieves higher accuracy and lower computational time in gaze estimation. Finally, we introduce a dual-branch convolutional neural network, named FG-Net, which processes the normalized images and extracts eye and face features through two separate branches. The extracted features are then fused and fed into a fully connected layer to estimate the 3D gaze vector. To evaluate the approach, we conduct ten-fold cross-validation experiments on two public datasets, MPIIGaze and EyeDiap, achieving mean angular errors of 3.11° and 2.75°, respectively. These results demonstrate the effectiveness of the proposed framework and its state-of-the-art performance in 3D gaze estimation.
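
The normalization step mentioned in the abstract follows the usual appearance-based gaze pipeline of warping the face into a virtual camera that looks straight at it; the paper's improvement is stated to rely on orthogonal matrices, but its exact formulation is not given in this record. The sketch below shows only the standard construction of such an orthogonal (rotation-only) normalization matrix from an estimated head pose; the function names and conventions are assumptions, not the authors' implementation.

```python
import numpy as np


def normalization_rotation(face_center: np.ndarray, head_rot: np.ndarray) -> np.ndarray:
    """Build an orthogonal rotation matrix R whose rows define a virtual camera
    that looks straight at the face centre with head roll cancelled.

    face_center: 3D face centre in camera coordinates (e.g. from landmarks + PnP).
    head_rot:    3x3 head rotation matrix from the landmark-based pose fit.
    """
    z = face_center / np.linalg.norm(face_center)   # forward axis: camera -> face
    head_x = head_rot[:, 0]                         # head's own x-axis
    y = np.cross(z, head_x)
    y /= np.linalg.norm(y)                          # vertical axis, roll-free
    x = np.cross(y, z)                              # completes a right-handed frame
    return np.stack([x, y, z])                      # orthogonal: R @ R.T == I


def normalize_gaze(gaze: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Rotate a ground-truth gaze vector into normalized space; because R is
    orthogonal, the inverse mapping is simply R.T."""
    return R @ gaze
```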

Bibliographic Details
Main Authors: Shang Tian, Haiyan Tu, Ling He, Yue Ivan Wu, Xiujuan Zheng
Author Affiliations: Shang Tian, Haiyan Tu, Xiujuan Zheng: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Ling He: College of Biomedical Engineering, Sichuan University, Chengdu 610065, China; Yue Ivan Wu: College of Computer Science, Sichuan University, Chengdu 610065, China
Format: Article
Language: English
Published: MDPI AG, 2023-12-01
Series: Sensors, Vol. 23, No. 23, Article 9604
ISSN: 1424-8220
DOI: 10.3390/s23239604
Subjects: gaze estimation; dual-branch CNN; improved normalization; eye features; face features
Online Access: https://www.mdpi.com/1424-8220/23/23/9604
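
The abstract describes FG-Net as a dual-branch CNN whose eye-branch and face-branch features are fused and passed to a fully connected head that regresses the 3D gaze vector. The exact layer configuration is not given in this record, so the following PyTorch sketch is only illustrative: the branch depths, feature sizes, and input resolutions are assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn


class DualBranchGazeNet(nn.Module):
    """Illustrative dual-branch gaze regressor: an eye branch and a face branch
    produce feature vectors that are concatenated and mapped to a 3D gaze
    direction by fully connected layers."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()

        def branch() -> nn.Sequential:
            # Small convolutional stack; the real FG-Net architecture may differ.
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(128, feat_dim), nn.ReLU(inplace=True),
            )

        self.eye_branch = branch()    # takes the normalized eye patch
        self.face_branch = branch()   # takes the normalized face patch
        self.head = nn.Sequential(    # fused multi-features -> 3D gaze vector
            nn.Linear(2 * feat_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 3),
        )

    def forward(self, eye_img: torch.Tensor, face_img: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.eye_branch(eye_img), self.face_branch(face_img)], dim=1)
        return nn.functional.normalize(self.head(fused), dim=1)  # unit gaze vector


# Example usage with hypothetical patch sizes (batch of 8):
# net = DualBranchGazeNet()
# gaze = net(torch.randn(8, 3, 64, 96), torch.randn(8, 3, 224, 224))
```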
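
The accuracies reported for MPIIGaze and EyeDiap are angular errors between predicted and ground-truth 3D gaze vectors. A minimal sketch of that metric, assuming batched unit-agnostic vector inputs, is shown below; the `folds` variable in the usage note is hypothetical.

```python
import numpy as np


def angular_error_deg(pred: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """Per-sample angle (in degrees) between predicted and ground-truth 3D gaze
    vectors; averaging this over a test fold gives figures such as 3.11 deg."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))


# Example: mean error over a ten-fold cross-validation split (folds is hypothetical):
# mean_err = np.mean([angular_error_deg(p, g).mean() for p, g in folds])
```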