EM-Gaze: eye context correlation and metric learning for gaze estimation

Bibliographic Details
Main Authors: Jinchao Zhou, Guoan Li, Feng Shi, Xiaoyan Guo, Pengfei Wan, Miao Wang
Format: Article
Language: English
Published: SpringerOpen 2023-05-01
Series: Visual Computing for Industry, Biomedicine, and Art
Subjects: Computer vision; Gaze estimation; Metric learning; Attention; Multi-task learning
Online Access: https://doi.org/10.1186/s42492-023-00135-6
collection DOAJ
description In recent years, deep learning techniques have been used to estimate gaze, a significant task in computer vision and human-computer interaction. Previous studies have made significant achievements in predicting 2D or 3D gazes from monocular face images. This study presents a deep neural network for 2D gaze estimation on mobile devices. It achieves state-of-the-art 2D gaze point regression error, while significantly improving gaze classification error on quadrant divisions of the display. To this end, an efficient attention-based module that correlates and fuses the left and right eye contextual features is first proposed to improve gaze point regression performance. Subsequently, through a unified perspective for gaze estimation, metric learning for gaze classification on quadrant divisions is incorporated as additional supervision. Consequently, both gaze point regression and quadrant classification performances are improved. The experiments demonstrate that the proposed method outperforms existing gaze-estimation methods on the GazeCapture and MPIIFaceGaze datasets.
first_indexed 2024-04-09T14:05:39Z
format Article
id doaj.art-9758ebc127ce40d18acdbf4183a9fef2
institution Directory Open Access Journal
issn 2524-4442
language English
last_indexed 2024-04-09T14:05:39Z
publishDate 2023-05-01
publisher SpringerOpen
record_format Article
series Visual Computing for Industry, Biomedicine, and Art
spelling doaj.art-9758ebc127ce40d18acdbf4183a9fef2
2023-05-07T11:04:31Z
eng
SpringerOpen
Visual Computing for Industry, Biomedicine, and Art
2524-4442
2023-05-01
Volume 6, issue 1, pages 1-12
10.1186/s42492-023-00135-6
Jinchao Zhou (State Key Laboratory of Virtual Reality Technology and Systems, Beihang University)
Guoan Li (State Key Laboratory of Virtual Reality Technology and Systems, Beihang University)
Feng Shi (Kuaishou Technology)
Xiaoyan Guo (Kuaishou Technology)
Pengfei Wan (Kuaishou Technology)
Miao Wang (State Key Laboratory of Virtual Reality Technology and Systems, Beihang University)
title EM-Gaze: eye context correlation and metric learning for gaze estimation
topic Computer vision
Gaze estimation
Metric learning
Attention
Multi-task learning
url https://doi.org/10.1186/s42492-023-00135-6