Personalizing human video pose estimation

Full Description

Detailed Bibliography
Main Authors: Charles, J, Pfister, T, Magee, D, Hogg, D, Zisserman, A
Material Type: Conference item
Publication Information: Institute of Electrical and Electronics Engineers, 2016
Collection: OXFORD
Description: <p>We propose a personalized ConvNet pose estimator that automatically adapts itself to the uniqueness of a person’s appearance to improve pose estimation in long videos.</p> <br/> <p>We make the following contributions: (i) we show that, given a few high-precision pose annotations, e.g. from a generic ConvNet pose estimator, additional annotations can be generated throughout the video using a combination of image-based matching for temporally distant frames and dense optical flow for temporally local frames; (ii) we develop an occlusion-aware self-evaluation model that is able to automatically select the high-quality additional annotations and reject the erroneous ones; and (iii) we demonstrate that these high-quality annotations can be used to fine-tune a ConvNet pose estimator and thereby personalize it to lock on to key discriminative features of the person’s appearance. The outcome is a substantial improvement in the pose estimates for the target video using the personalized ConvNet compared to the original generic ConvNet.</p> <br/> <p>Our method outperforms the state of the art (including top ConvNet methods) by a large margin on three standard benchmarks, as well as on a new challenging YouTube video dataset. Furthermore, we show that training from the automatically generated annotations can be used to improve the performance of a generic ConvNet on other benchmarks.</p>
Format: Conference item
ID: oxford-uuid:046eef2e-1918-4bb7-ac4e-3bda9ee7f90e
Institution: University of Oxford