3D Behavior Recognition Based on Multi-Modal Deep Space-Time Learning

Bibliographic Details
Main Authors: Chong Zhao, Minglin Chen, Jinhao Zhao, Qicong Wang, Yehu Shen
Format: Article
Language: English
Published: MDPI AG 2019-02-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/9/4/716
Description
Summary: This paper proposes a dual-stream 3D space-time convolutional neural network framework for action recognition. The original depth map sequence is used as the input to one stream in order to learn the global space-time characteristics of each action category. To account for the high temporal correlation within a human action, the deep motion map sequence is introduced as the input to the other stream of the 3D space-time convolutional network. In addition, the corresponding 3D skeleton sequence serves as a third input to the recognition framework. Although skeleton sequences have the advantage of carrying 3D information, they also suffer from rate variation, temporal mismatch, and noise, so specially designed space-time features are applied to cope with these problems. Together, these inputs allow the recognition system to exploit discriminative space-time features from different perspectives and ultimately improve classification accuracy. Experimental results on different public 3D data sets illustrate the effectiveness of the proposed method.
ISSN: 2076-3417
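
Illustration: The summary describes three inputs (depth map sequence, motion map sequence, and 3D skeleton sequence) that jointly drive one classification decision. The record does not say how the outputs are combined, so the following is only a minimal sketch under the assumption of simple late fusion, where each input branch yields per-class scores that are converted to probabilities and averaged. The function names, the weighting scheme, and the example scores are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_scores, weights=None):
    """Late fusion by weighted averaging of per-stream class probabilities.

    ASSUMPTION: averaging is an illustrative choice; the source record
    does not specify the fusion rule used by the authors.

    stream_scores: one list of raw class scores per input branch.
    weights: optional per-branch weights (defaults to equal weighting).
    Returns (predicted_class_index, fused_probabilities).
    """
    if not stream_scores:
        raise ValueError("at least one stream is required")
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("all streams must score the same number of classes")
    if weights is None:
        weights = [1.0] * len(stream_scores)
    if len(weights) != len(stream_scores):
        raise ValueError("one weight per stream is required")

    weight_sum = sum(weights)
    probs = [softmax(s) for s in stream_scores]
    fused = [
        sum(w * p[c] for w, p in zip(weights, probs)) / weight_sum
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=fused.__getitem__)
    return predicted, fused


# Hypothetical scores for three action classes from each branch.
depth_scores = [2.0, 0.5, 0.1]      # depth map sequence branch
motion_scores = [0.2, 1.5, 0.3]     # motion map sequence branch
skeleton_scores = [1.8, 0.4, 0.2]   # 3D skeleton sequence branch

label, fused = fuse_streams([depth_scores, motion_scores, skeleton_scores])
print(label, [round(p, 3) for p in fused])
```

In this toy example the motion branch alone would favor the second class, but the depth and skeleton branches outweigh it after averaging, which is the kind of complementary evidence the summary attributes to using several perspectives.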