Capturing Complex 3D Human Motions with Kernelized Low-Rank Representation from Monocular RGB Camera

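Recovering 3D structures from the monocular image sequence is an inherently ambiguous problem that has attracted considerable attention from several research communities. To resolve the ambiguities, a variety of additional priors, such as low-rank shape basis, have been proposed. In this paper, we make two contributions. First, we introduce an assumption that 3D structures lie on the union of nonlinear subspaces. Based on this assumption, we propose a Non-Rigid Structure from Motion (NRSfM) method with kernelized low-rank representation. To be specific, we utilize the soft-inextensibility constraint to accurately recover 3D human motions. Second, we extend this NRSfM method to the marker-less 3D human pose estimation problem by combining with Convolutional Neural Network (CNN) based 2D human joint detectors. To evaluate the performance of our methods, we apply our marker-based method on several sequences from Utrecht Multi-Person Motion (UMPM) benchmark and CMU MoCap datasets, and then apply the marker-less method on the Human3.6M datasets. The experiments demonstrate that the kernelized low-rank representation is more suitable for modeling the complex deformation and the method consequently yields more accurate reconstructions. Benefiting from the CNN-based detector, the marker-less approach can be applied to more real-life applications.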

Bibliographic Details
Main Authors:	Xuan Wang, Fei Wang, Yanan Chen
Format:	Article
Language:	English
Published:	MDPI AG 2017-09-01
Series:	Sensors
Subjects:	3D human pose estimation; monocular reconstruction; non-rigid structure from motion; kernel low-rank representation
Online Access:	https://www.mdpi.com/1424-8220/17/9/2019

author	Xuan Wang, Fei Wang, Yanan Chen
collection	DOAJ
description	Recovering 3D structures from the monocular image sequence is an inherently ambiguous problem that has attracted considerable attention from several research communities. To resolve the ambiguities, a variety of additional priors, such as low-rank shape basis, have been proposed. In this paper, we make two contributions. First, we introduce an assumption that 3D structures lie on the union of nonlinear subspaces. Based on this assumption, we propose a Non-Rigid Structure from Motion (NRSfM) method with kernelized low-rank representation. To be specific, we utilize the soft-inextensibility constraint to accurately recover 3D human motions. Second, we extend this NRSfM method to the marker-less 3D human pose estimation problem by combining with Convolutional Neural Network (CNN) based 2D human joint detectors. To evaluate the performance of our methods, we apply our marker-based method on several sequences from Utrecht Multi-Person Motion (UMPM) benchmark and CMU MoCap datasets, and then apply the marker-less method on the Human3.6M datasets. The experiments demonstrate that the kernelized low-rank representation is more suitable for modeling the complex deformation and the method consequently yields more accurate reconstructions. Benefiting from the CNN-based detector, the marker-less approach can be applied to more real-life applications.
first_indexed	2024-04-11T22:06:03Z
format	Article
id	doaj.art-6837fa6fc15f4e84b50b551100a84997
institution	Directory of Open Access Journals
issn	1424-8220
language	English
last_indexed	2024-04-11T22:06:03Z
publishDate	2017-09-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-6837fa6fc15f4e84b50b551100a84997; 2022-12-22T04:00:43Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2017-09-01; vol. 17, no. 9, art. 2019; DOI 10.3390/s17092019; Xuan Wang, Fei Wang, Yanan Chen (all: The Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, No.28 Xianning West Road, Xi’an 710048, China)
title	Capturing Complex 3D Human Motions with Kernelized Low-Rank Representation from Monocular RGB Camera
topic	3D human pose estimation; monocular reconstruction; non-rigid structure from motion; kernel low-rank representation
url	https://www.mdpi.com/1424-8220/17/9/2019

Similar Items