Enhancing Human Pose Estimation with Privileged Learning

Bibliographic Details
Main Authors: Alexander Marusov, Mariam Kaprielova, Radoslav Neychev
Format: Article
Language: English
Published: FRUCT 2022-04-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Online Access: https://www.fruct.org/publications/fruct31/files/Mar2.pdf
Description
Summary: The Transformer architecture has brought significant improvements in a range of applications, such as Natural Language Processing, Computer Vision, and even Graph Machine Learning. Recent advances in Human Pose Estimation (HPE) show that Vision Transformers are a strong choice for this problem as well. However, even state-of-the-art architectures require additional enhancements to the training process to achieve the best results. In this paper, we propose a privileged learning approach to HPE that incorporates information about body proportions into the training pipeline. We evaluate our method quantitatively and qualitatively on the standard benchmark dataset Human3.6M. The proposed method shows stable improvements while using the same model architecture.
ISSN: 2305-7254
2343-0737