Human interaction recognition fusing multiple features of depth sequences

Bibliographic Details
Main Authors: Jianjun Li, Xia Mao, Lijiang Chen, Lan Wang
Format: Article
Language: English
Published: Wiley 2017-10-01
Series: IET Computer Vision
Subjects:
Online Access: https://doi.org/10.1049/iet-cvi.2017.0025
Description
Summary: Human interaction recognition has played a major role in building intelligent video surveillance systems. Recently, depth data captured by emerging RGB‐D sensors has begun to show its importance for human interaction recognition. This study proposes a novel framework for human interaction recognition using depth information, including an algorithm that reconstructs a depth sequence from as few key frames as possible. The proposed framework includes two essential modules. First, key frames are extracted under a sparse constraint; second, a fused multi‐feature representation is constructed from two types of available features using max‐pooling. Finally, the multiple features are sent directly to an SVM to recognise the human activity. This study explores a static and dynamic feature fusion method that improves recognition performance by exploiting the contextual relevance of continuous frames. A weight is used to fuse shape and optical flow features, which not only enhances the description of human behavioural characteristics in the spatiotemporal domain but also effectively reduces the adverse impact of certain distorted points of interest on target recognition. Experimental results show that the proposed approach yields a considerable accuracy improvement over state‐of‐the‐art approaches on a public action dataset.
ISSN: 1751-9632, 1751-9640
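
Illustration: the summary mentions fusing shape and optical flow features with a weight, followed by max‐pooling. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the weight is taken to scale the two descriptors before concatenation, and max‐pooling is taken to be an element‐wise maximum over key frames. The names fuse_frame, max_pool_frames and describe_sequence are hypothetical and do not come from the article.

```python
from typing import List, Sequence


def fuse_frame(shape_feat: Sequence[float],
               flow_feat: Sequence[float],
               weight: float) -> List[float]:
    """Weighted concatenation of a static (shape) and a dynamic (optical flow)
    descriptor for one frame. ASSUMPTION: the article's weighting may differ."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    scaled_shape = [weight * v for v in shape_feat]
    scaled_flow = [(1.0 - weight) * v for v in flow_feat]
    return scaled_shape + scaled_flow


def max_pool_frames(frames: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise maximum over per-frame vectors of equal length.
    ASSUMPTION: pooling runs across key frames, one value per dimension."""
    if not frames:
        raise ValueError("need at least one frame")
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frame vectors must have the same length")
    return [max(column) for column in zip(*frames)]


def describe_sequence(shape_feats: Sequence[Sequence[float]],
                      flow_feats: Sequence[Sequence[float]],
                      weight: float = 0.5) -> List[float]:
    """Fuse each key frame, then pool into one fixed-length sequence descriptor."""
    if len(shape_feats) != len(flow_feats):
        raise ValueError("need one shape and one flow descriptor per frame")
    fused = [fuse_frame(s, f, weight) for s, f in zip(shape_feats, flow_feats)]
    return max_pool_frames(fused)
```

A descriptor produced this way has a fixed length regardless of how many key frames were selected, which is what would allow it to be passed to a classifier such as the SVM named in the summary. Key frame selection and the classifier itself are omitted here.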