Dual‐view 3D human pose estimation without camera parameters for action recognition
Abstract The purpose of 3D human pose estimation is to estimate the 3D coordinates of key points of the human body directly from images. Although multi‐view methods offer better performance and higher coordinate-estimation precision than single‐view methods, they require known camera parameters. To avoid this constraint and improve the generalizability of the model, a dual‐view single‐person 3D pose estimation method without camera parameters is proposed. The method first uses the 2D pose estimation network HRNet to estimate 2D joint coordinates from two images with different views, and then feeds them into a 3D regression network that produces the final 3D joint coordinates. To make the 3D regression network fully learn the spatial structure of the human body and the transformation and projection relationships between different views, a self‐supervised training method is designed based on a 3D human‐pose orthogonal projection model that generates virtual views. In pose estimation experiments on the Human3.6M dataset, the method achieves a significantly improved estimation error of 34.5 mm. Furthermore, action recognition based on the human poses extracted by the proposed method achieves an accuracy of 83.19%.
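The virtual views mentioned in the abstract come from an orthogonal (orthographic) projection of a 3D pose: rotate the skeleton and drop the depth axis, with no camera intrinsics needed. The sketch below illustrates that general idea only and is not the authors' implementation; the function name, the choice of a yaw rotation, and the joint count are assumptions.

```python
import numpy as np

def orthographic_project(pose_3d, yaw_deg):
    """Rotate a 3D pose about the vertical (y) axis, then orthographically
    project onto the x-y image plane by discarding the depth coordinate.

    pose_3d: (J, 3) array of joint coordinates.
    Returns a (J, 2) virtual-view 2D pose.
    """
    t = np.deg2rad(yaw_deg)
    # Rotation matrix about the y axis
    R = np.array([[ np.cos(t), 0.0, np.sin(t)],
                  [ 0.0,       1.0, 0.0      ],
                  [-np.sin(t), 0.0, np.cos(t)]])
    rotated = pose_3d @ R.T
    return rotated[:, :2]  # orthogonal projection: keep x, y; drop z

# Example: generate a virtual side view of a 17-joint pose
pose = np.random.rand(17, 3)
virtual_2d = orthographic_project(pose, yaw_deg=90.0)
print(virtual_2d.shape)  # (17, 2)
```

Because the projection is orthographic, no focal length or principal point enters the computation, which is what lets such virtual views supervise the regression network without camera parameters.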
Main Authors: | Long Liu, Le Yang, Wanjun Chen, Xin Gao |
---|---|
Format: | Article |
Language: | English |
Published: |
Wiley
2021-12-01
|
Series: | IET Image Processing |
Subjects: | Image recognition; Image sensors; Computer vision and image processing techniques; Regression analysis; Machine learning (artificial intelligence) |
Online Access: | https://doi.org/10.1049/ipr2.12277 |
---|---|
collection | DOAJ |
issn | 1751-9659 1751-9667 |
citation | IET Image Processing, vol. 15, iss. 14, pp. 3433–3440, 2021-12-01, doi:10.1049/ipr2.12277 |
affiliations | Long Liu, Le Yang, Xin Gao: School of Automation and Information Engineering, Xi'an University of Technology, Xi'an, Shaanxi, China. Wanjun Chen: Department of Information Science, Xi'an University of Technology, Xi'an, Shaanxi, China. |