Multimodal human behavior analysis: Learning correlation and interaction across modalities
Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes capturing nonlinear hidden dynamics easier and MV-HCRF helps learning interaction across modalities.
Main Authors: | Song, Yale, Morency, Louis-Philippe, Davis, Randall |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | 2014 |
Online Access: | http://hdl.handle.net/1721.1/86099 https://orcid.org/0000-0001-5232-7281 |
_version_ | 1811079139174645760 |
---|---|
author | Song, Yale Morency, Louis-Philippe Davis, Randall |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Song, Yale Morency, Louis-Philippe Davis, Randall |
author_sort | Song, Yale |
collection | MIT |
description | Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes capturing nonlinear hidden dynamics easier and MV-HCRF helps learning interaction across modalities. |
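The abstract's KCCA step maps each modality into a kernel feature space and finds projections maximizing cross-modal correlation. As an illustration only, and not the authors' implementation, the following NumPy sketch shows regularized KCCA in its dual form; the RBF kernel, the `gamma` and `reg` values, and the function names are all assumptions for this example.

```python
import numpy as np

def rbf_kernel(A, gamma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared distances.
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.exp(-gamma * d2)

def center(K):
    # Center a kernel matrix in feature space: HKH with H = I - 11^T/n.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(X, Y, gamma=1.0, reg=1e-3, n_components=2):
    """Regularized kernel CCA (illustrative sketch).

    Returns projections of the X view onto the shared subspace and the
    estimated canonical correlations.
    """
    n = X.shape[0]
    Kx = center(rbf_kernel(X, gamma))
    Ky = center(rbf_kernel(Y, gamma))
    # Dual-form generalized eigenproblem, regularized for stability:
    # M alpha = rho^2 alpha, with M = (Kx+rI)^-1 Ky (Ky+rI)^-1 Kx.
    Ix = np.linalg.inv(Kx + reg * np.eye(n))
    Iy = np.linalg.inv(Ky + reg * np.eye(n))
    M = Ix @ Ky @ Iy @ Kx
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(-vals.real)[:n_components]
    alphas = vecs[:, order].real                 # dual weights for view X
    corrs = np.sqrt(np.clip(vals[order].real, 0.0, 1.0))
    return Kx @ alphas, corrs
```

In the paper's pipeline the projected sequences would then be fed to the MV-HCRF, whose per-modality latent chains are a separate model not sketched here.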
first_indexed | 2024-09-23T11:10:36Z |
format | Article |
id | mit-1721.1/86099 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T11:10:36Z |
publishDate | 2014 |
record_format | dspace |
spelling | mit-1721.1/86099 2022-09-27T17:37:30Z Multimodal human behavior analysis: Learning correlation and interaction across modalities Song, Yale Morency, Louis-Philippe Davis, Randall Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Song, Yale Davis, Randall Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes capturing nonlinear hidden dynamics easier and MV-HCRF helps learning interaction across modalities. United States. Office of Naval Research (Grant N000140910625) National Science Foundation (U.S.) (Grant IIS-1118018) National Science Foundation (U.S.) (Grant IIS-1018055) United States. Army Research, Development, and Engineering Command 2014-04-11T14:20:52Z 2014-04-11T14:20:52Z 2012-10 Article http://purl.org/eprint/type/ConferencePaper 9781450314671 http://hdl.handle.net/1721.1/86099 Yale Song, Louis-Philippe Morency, and Randall Davis. 2012. Multimodal human behavior analysis: learning correlation and interaction across modalities. In Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12). ACM, New York, NY, USA, 27-30. https://orcid.org/0000-0001-5232-7281 en_US http://dx.doi.org/10.1145/2388676.2388684 Proceedings of the 14th ACM international conference on Multimodal interaction (ICMI '12) Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf MIT web domain |
spellingShingle | Song, Yale Morency, Louis-Philippe Davis, Randall Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title | Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title_full | Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title_fullStr | Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title_full_unstemmed | Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title_short | Multimodal human behavior analysis: Learning correlation and interaction across modalities |
title_sort | multimodal human behavior analysis learning correlation and interaction across modalities |
url | http://hdl.handle.net/1721.1/86099 https://orcid.org/0000-0001-5232-7281 |
work_keys_str_mv | AT songyale multimodalhumanbehavioranalysislearningcorrelationandinteractionacrossmodalities AT morencylouisphilippe multimodalhumanbehavioranalysislearningcorrelationandinteractionacrossmodalities AT davisrandall multimodalhumanbehavioranalysislearningcorrelationandinteractionacrossmodalities |