Action Recognition by Hierarchical Sequence Summarization

Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization appro...

Bibliographic Details
Main Authors:	Song, Yale; Morency, Louis-Philippe; Davis, Randall
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	2014
Online Access:	http://hdl.handle.net/1721.1/86123 https://orcid.org/0000-0001-5232-7281

_version_	1826207839291441152
author	Song, Yale Morency, Louis-Philippe Davis, Randall
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Song, Yale Morency, Louis-Philippe Davis, Randall
author_sort	Song, Yale
collection	MIT
description	Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning, we use CRFs with latent variables to learn hidden spatio-temporal dynamics; for sequence summarization, we group observations that have similar semantic meaning in the latent space. For each layer, we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sublinearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets.
first_indexed	2024-09-23T13:55:46Z
format	Article
id	mit-1721.1/86123
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T13:55:46Z
publishDate	2014
record_format	dspace
spelling	mit-1721.1/86123 2022-10-01T18:02:55Z Action Recognition by Hierarchical Sequence Summarization Song, Yale Morency, Louis-Philippe Davis, Randall Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Song, Yale Davis, Randall Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning, we use CRFs with latent variables to learn hidden spatio-temporal dynamics; for sequence summarization, we group observations that have similar semantic meaning in the latent space. For each layer, we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sublinearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets. United States. Office of Naval Research (N000140910625) National Science Foundation (U.S.) (IIS-1018055) United States. Army Research, Development, and Engineering Command 2014-04-11T18:42:47Z 2014-04-11T18:42:47Z 2013-06 Article http://purl.org/eprint/type/ConferencePaper 978-0-7695-4989-7 http://hdl.handle.net/1721.1/86123 Song, Yale, Louis-Philippe Morency, and Randall Davis. “Action Recognition by Hierarchical Sequence Summarization.” 2013 IEEE Conference on Computer Vision and Pattern Recognition (n.d.). https://orcid.org/0000-0001-5232-7281 en_US http://dx.doi.org/10.1109/CVPR.2013.457 Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf MIT web domain
spellingShingle	Song, Yale Morency, Louis-Philippe Davis, Randall Action Recognition by Hierarchical Sequence Summarization
title	Action Recognition by Hierarchical Sequence Summarization
title_sort	action recognition by hierarchical sequence summarization
url	http://hdl.handle.net/1721.1/86123 https://orcid.org/0000-0001-5232-7281
work_keys_str_mv	AT songyale actionrecognitionbyhierarchicalsequencesummarization AT morencylouisphilippe actionrecognitionbyhierarchicalsequencesummarization AT davisrandall actionrecognitionbyhierarchicalsequencesummarization

Similar Items