Action Recognition by Hierarchical Sequence Summarization
Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization appro...
Main Authors: | Yale Song, Louis-Philippe Morency, Randall Davis |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | 2014 |
Online Access: | http://hdl.handle.net/1721.1/86123 https://orcid.org/0000-0001-5232-7281 |
_version_ | 1826207839291441152 |
---|---|
author | Song, Yale Morency, Louis-Philippe Davis, Randall |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Song, Yale Morency, Louis-Philippe Davis, Randall |
author_sort | Song, Yale |
collection | MIT |
description | Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning we use CRFs with latent variables to learn hidden spatio-temporal dynamics; for sequence summarization we group observations that have similar semantic meaning in the latent space. For each layer we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sublinearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets. |
first_indexed | 2024-09-23T13:55:46Z |
format | Article |
id | mit-1721.1/86123 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:55:46Z |
publishDate | 2014 |
record_format | dspace |
spelling | mit-1721.1/861232022-10-01T18:02:55Z Action Recognition by Hierarchical Sequence Summarization Song, Yale Morency, Louis-Philippe Davis, Randall Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Song, Yale Davis, Randall Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning we use CRFs with latent variables to learn hidden spatio-temporal dynamics; for sequence summarization we group observations that have similar semantic meaning in the latent space. For each layer we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sublinearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets. United States. Office of Naval Research (N000140910625) National Science Foundation (U.S.) (IIS-1018055) United States. Army Research, Development, and Engineering Command 2014-04-11T18:42:47Z 2014-04-11T18:42:47Z 2013-06 Article http://purl.org/eprint/type/ConferencePaper 978-0-7695-4989-7 http://hdl.handle.net/1721.1/86123 Song, Yale, Louis-Philippe Morency, and Randall Davis. “Action Recognition by Hierarchical Sequence Summarization.” 2013 IEEE Conference on Computer Vision and Pattern Recognition (n.d.). https://orcid.org/0000-0001-5232-7281 en_US http://dx.doi.org/10.1109/CVPR.2013.457 Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf MIT web domain |
spellingShingle | Song, Yale Morency, Louis-Philippe Davis, Randall Action Recognition by Hierarchical Sequence Summarization |
title | Action Recognition by Hierarchical Sequence Summarization |
title_full | Action Recognition by Hierarchical Sequence Summarization |
title_fullStr | Action Recognition by Hierarchical Sequence Summarization |
title_full_unstemmed | Action Recognition by Hierarchical Sequence Summarization |
title_short | Action Recognition by Hierarchical Sequence Summarization |
title_sort | action recognition by hierarchical sequence summarization |
url | http://hdl.handle.net/1721.1/86123 https://orcid.org/0000-0001-5232-7281 |
work_keys_str_mv | AT songyale actionrecognitionbyhierarchicalsequencesummarization AT morencylouisphilippe actionrecognitionbyhierarchicalsequencesummarization AT davisrandall actionrecognitionbyhierarchicalsequencesummarization |
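
The description in this record outlines a loop that alternates sequence learning with sequence summarization. The toy sketch below illustrates only the summarization half, under strong assumptions that are not taken from the paper: per-frame latent posteriors are supplied as input rather than learned by a latent-variable CRF, adjacent frames are merged with a simple similarity threshold, and a plain sigmoid stands in for the non-linear gate functions. The names `summarize_once`, `build_hierarchy`, and `threshold` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, used here as a stand-in gate."""
    return 1.0 / (1.0 + np.exp(-x))


def summarize_once(features, latent_posteriors, threshold=0.5):
    """Merge adjacent frames whose latent posteriors look alike.

    features:          (T, D) array of per-frame observations.
    latent_posteriors: (T, K) array; each row sums to 1 (ASSUMED given,
                       whereas the paper learns them with latent CRFs).
    Returns gated group features, pooled posteriors, and the frame groups.
    """
    groups = [[0]]
    for t in range(1, len(features)):
        p, q = latent_posteriors[t - 1], latent_posteriors[t]
        # Similarity = 1 - total variation distance (an assumed choice).
        similarity = 1.0 - 0.5 * np.abs(p - q).sum()
        if similarity >= threshold:
            groups[-1].append(t)   # same segment as the previous frame
        else:
            groups.append([t])     # start a new segment
    # Average each group, then pass the result through the gate.
    pooled_x = np.stack([features[g].mean(axis=0) for g in groups])
    pooled_p = np.stack([latent_posteriors[g].mean(axis=0) for g in groups])
    return sigmoid(pooled_x), pooled_p, groups


def build_hierarchy(features, latent_posteriors, num_layers=3, threshold=0.5):
    """Apply summarization repeatedly to obtain coarser and coarser layers.

    Posteriors are simply averaged between layers here; re-learning them at
    each layer, as the description implies, is outside this sketch.
    """
    layers = [features]
    x, p = features, latent_posteriors
    for _ in range(num_layers - 1):
        x, p, _ = summarize_once(x, p, threshold)
        layers.append(x)
    return layers
```

Calling `build_hierarchy(x, p)` with a `(T, D)` feature array and a `(T, K)` posterior array returns a list of arrays, one per layer, each no longer than the previous one.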