Deep convolutional neural networks for efficient pose estimation in gesture videos

Bibliographic Details
Main Authors: Pfister, T; Simonyan, K; Charles, J; Zisserman, A
Format: Conference item
Language: English
Published: Springer 2015
collection OXFORD
description Our objective is to efficiently and accurately estimate the upper body pose of humans in gesture videos. To this end, we build on the recent successful applications of deep convolutional neural networks (ConvNets). Our novelties are: (i) our method is the first to our knowledge to use ConvNets for estimating human pose in videos; (ii) a new network that exploits temporal information from multiple frames, leading to better performance; (iii) showing that pre-segmenting the foreground of the video improves performance; and (iv) demonstrating that even without foreground segmentations, the network learns to abstract away from the background and can estimate the pose even in the presence of a complex, varying background.
id oxford-uuid:05ebcea8-1ba4-49a0-82fb-f37cfb75c6e3
institution University of Oxford
spelling oxford-uuid:05ebcea8-1ba4-49a0-82fb-f37cfb75c6e3; 2024-11-26T13:28:30Z; Deep convolutional neural networks for efficient pose estimation in gesture videos; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:05ebcea8-1ba4-49a0-82fb-f37cfb75c6e3; English; Symplectic Elements; Springer; 2015; Pfister, T; Simonyan, K; Charles, J; Zisserman, A