Meta-Learning, Fast Adaptation, and Latent Representation for Head Pose Estimation

Head pose estimation is used in a variety of human-computer interface applications, such as gaze tracking, driving assistance, assistance for people with impairments, and entertainment. Advances in convolutional neural networks have considerably improved the performance of head pose estimation. However, difficul...

Bibliographic Details
Main Authors: Manoj Joshi, Dibakar Raj Pant, Rupesh Raj Karn, Jukka Heikkonen, Rajeev Kanth
Format: Article
Language: English
Published: FRUCT 2022-04-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects: head pose estimation, meta-learning, deep learning, representation learning, few-shot learning
Online Access: https://www.fruct.org/publications/fruct31/files/Jos.pdf
_version_ 1818231636059226112
author Manoj Joshi
Dibakar Raj Pant
Rupesh Raj Karn
Jukka Heikkonen
Rajeev Kanth
collection DOAJ
description Head pose estimation is used in a variety of human-computer interface applications, such as gaze tracking, driving assistance, assistance for people with impairments, and entertainment. Advances in convolutional neural networks have considerably improved the performance of head pose estimation. However, the difficulty of capturing well-labelled head pose data and differences in facial features between persons make these networks difficult to use. This work proposes a meta-learning-based technique for the head pose estimation problem on the BIWI head pose dataset. An approach to learning a latent representation of head pose features using a variational autoencoder is implemented. Then a fast, adaptable head pose estimator is trained using meta-learning in a few-shot setting. The model-agnostic meta-learning (MAML) algorithm is used to train the head pose estimator. A mean average error (MAEavg) of 7.33 is achieved in predicting head pose angles in the one-shot setting. After meta-training, the optimized model is used to analyze fast adaptation on a test set that has been separated from the BIWI head pose dataset. We begin with the trained network's optimum parameters and optimize the inner loop for quick adaptation. The optimized model can predict accurate head poses using as few as 10 gradient descent steps on an unseen set of tasks sampled from the test set.
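The description above mentions meta-training with MAML and then adapting with as few as 10 gradient descent steps on unseen tasks. The following is only a minimal sketch of that inner-loop/outer-loop idea, not the authors' implementation: it assumes a toy linear regression model in place of the head pose estimator, uses the first-order approximation of MAML (second-order terms are dropped), and every name in it (`mse_and_grad`, `adapt`, `sample_task`, `meta_train`, the learning rates and task sizes) is hypothetical.

```python
import numpy as np

def mse_and_grad(w, X, y):
    """Mean squared error of the linear model X @ w and its gradient w.r.t. w."""
    err = X @ w - y
    return float(np.mean(err ** 2)), 2.0 * X.T @ err / len(y)

def adapt(w, X, y, inner_lr=0.1, steps=10):
    """Inner loop: a few plain gradient-descent steps on one task's support set."""
    w = w.copy()
    for _ in range(steps):
        _, g = mse_and_grad(w, X, y)
        w -= inner_lr * g
    return w

def sample_task(rng, n=10):
    """Hypothetical toy task: a random line y = a*x + b with support and query sets."""
    a, b = rng.uniform(-2, 2, size=2)

    def draw():
        x = rng.uniform(-1, 1, size=n)
        X = np.stack([x, np.ones(n)], axis=1)  # columns: feature, bias
        return X, a * x + b

    return draw(), draw()

def meta_train(rng, meta_steps=200, tasks_per_step=8, outer_lr=0.05, inner_steps=1):
    """Outer loop (first-order approximation): average the query-set gradients
    evaluated at the task-adapted parameters and update the shared initialization."""
    w = np.zeros(2)
    for _ in range(meta_steps):
        meta_grad = np.zeros_like(w)
        for _ in range(tasks_per_step):
            (Xs, ys), (Xq, yq) = sample_task(rng)
            w_task = adapt(w, Xs, ys, steps=inner_steps)
            _, g = mse_and_grad(w_task, Xq, yq)
            meta_grad += g
        w -= outer_lr * meta_grad / tasks_per_step
    return w

rng = np.random.default_rng(0)
w_meta = meta_train(rng)

# Fast adaptation on an unseen task: 10 gradient steps from the meta-learned start.
(Xs, ys), (Xq, yq) = sample_task(rng)
w_adapted = adapt(w_meta, Xs, ys, steps=10)
support_before, _ = mse_and_grad(w_meta, Xs, ys)
support_after, _ = mse_and_grad(w_adapted, Xs, ys)
query_after, _ = mse_and_grad(w_adapted, Xq, yq)
print(support_before, support_after, query_after)
```

In this sketch the meta-learned vector `w_meta` plays the role of the "trained network's optimum parameters" from which adaptation starts; a real head pose estimator would replace the linear model with a network operating on the latent features.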
first_indexed 2024-12-12T10:53:32Z
format Article
id doaj.art-d9f7bfa58d10458b974fdc09020389ea
institution Directory Open Access Journal
issn 2305-7254
2343-0737
language English
last_indexed 2024-12-12T10:53:32Z
publishDate 2022-04-01
publisher FRUCT
record_format Article
series Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling doaj.art-d9f7bfa58d10458b974fdc09020389ea; 2022-12-22T00:26:43Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; 2305-7254; 2343-0737; 2022-04-01; 3117178; 10.23919/FRUCT54823.2022.9770932
Meta-Learning, Fast Adaptation, and Latent Representation for Head Pose Estimation
Manoj Joshi (Institute of Engineering / Nepal, Nepal); Dibakar Raj Pant (Institute of Engineering / Nepal, Nepal); Rupesh Raj Karn (Khalifa University, United Arab Emirates); Jukka Heikkonen (University of Turku, Finland); Rajeev Kanth (Savonia University of Applied Sciences / Kuopio, Finland)
Head pose estimation is used in a variety of human-computer interface applications, such as gaze tracking, driving assistance, assistance for people with impairments, and entertainment. Advances in convolutional neural networks have considerably improved the performance of head pose estimation. However, the difficulty of capturing well-labelled head pose data and differences in facial features between persons make these networks difficult to use. This work proposes a meta-learning-based technique for the head pose estimation problem on the BIWI head pose dataset. An approach to learning a latent representation of head pose features using a variational autoencoder is implemented. Then a fast, adaptable head pose estimator is trained using meta-learning in a few-shot setting. The model-agnostic meta-learning (MAML) algorithm is used to train the head pose estimator. A mean average error (MAEavg) of 7.33 is achieved in predicting head pose angles in the one-shot setting. After meta-training, the optimized model is used to analyze fast adaptation on a test set that has been separated from the BIWI head pose dataset. We begin with the trained network's optimum parameters and optimize the inner loop for quick adaptation. The optimized model can predict accurate head poses using as few as 10 gradient descent steps on an unseen set of tasks sampled from the test set.
https://www.fruct.org/publications/fruct31/files/Jos.pdf
head pose estimation; meta-learning; deep learning; representation learning; few-shot learning
title Meta-Learning, Fast Adaptation, and Latent Representation for Head Pose Estimation
topic head pose estimation
meta-learning
deep learning
representation learning
few-shot learning
url https://www.fruct.org/publications/fruct31/files/Jos.pdf