Video retrieval by mimicking poses

We describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. A user can form a query by (i) interactively controlling a stickman on a web-based GUI, (ii) uploading an image of the desired pose, or (iii) using the Kinect and acting out the query himself....

Bibliographic Details
Main Authors: Jammalamadaka, N, Zisserman, A, Eichner, M, Ferrari, V, Jawahar, CV
Format: Conference item
Language: English
Published: Association for Computing Machinery 2012
author Jammalamadaka, N
Zisserman, A
Eichner, M
Ferrari, V
Jawahar, CV
collection OXFORD
description We describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. A user can form a query by (i) interactively controlling a stickman on a web-based GUI, (ii) uploading an image of the desired pose, or (iii) using the Kinect and acting out the query himself. The method is scalable and is applied to a dataset of 18 films totaling more than three million frames. The real-time performance is achieved by searching for approximate nearest neighbors to the query using a random forest of K-D trees. Apart from the query modalities, we introduce two other areas of novelty. First, we show that pose retrieval can proceed using a low-dimensional representation. Second, we show that the precision of the results can be improved substantially by combining the outputs of independent human pose estimation algorithms. The performance of the system is assessed quantitatively over a range of pose queries.
first_indexed 2024-12-09T03:37:29Z
format Conference item
id oxford-uuid:8373677c-60cf-4bae-aa07-c0fb3a258398
institution University of Oxford
language English
last_indexed 2024-12-09T03:37:29Z
publishDate 2012
publisher Association for Computing Machinery
record_format dspace
spelling oxford-uuid:8373677c-60cf-4bae-aa07-c0fb3a258398 | 2024-12-03T14:36:02Z | Video retrieval by mimicking poses | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:8373677c-60cf-4bae-aa07-c0fb3a258398 | English | Symplectic Elements | Association for Computing Machinery | 2012 | Jammalamadaka, N; Zisserman, A; Eichner, M; Ferrari, V; Jawahar, CV
title Video retrieval by mimicking poses