A Deep Attention Model for Action Recognition from Skeleton Data

This paper presents a new IndRNN-based deep attention model, termed DA-IndRNN, for skeleton-based action recognition to effectively model the fact that different joints are usually of different degrees of importance to different action categories. The model consists of (a) a deep IndRNN as the main...

Bibliographic Details
Main Authors: Yanbo Gao, Chuankun Li, Shuai Li, Xun Cai, Mao Ye, Hui Yuan
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Applied Sciences
Subjects: skeleton-based action recognition; IndRNN; RNN; attention model
Online Access: https://www.mdpi.com/2076-3417/12/4/2006
author Yanbo Gao
Chuankun Li
Shuai Li
Xun Cai
Mao Ye
Hui Yuan
collection DOAJ
description This paper presents a new IndRNN-based deep attention model, termed DA-IndRNN, for skeleton-based action recognition. It models the fact that different joints usually differ in importance across action categories. The model consists of (a) a deep IndRNN as the main classification network, which overcomes the limitations of a shallow RNN in order to obtain deeper and longer features, and (b) a deep attention network with multiple fully connected layers to estimate reliable attention weights. To train the DA-IndRNN, a new triplet loss function is proposed to guide the learning of attention across action categories. Specifically, this triplet loss enforces intra-class attention distances to be smaller than inter-class attention distances while still allowing multiple attention weight patterns to exist for the same class. The proposed DA-IndRNN can be trained end-to-end. Experiments on widely used datasets, including the NTU RGB + D dataset and the UOW Large-Scale Combined (LSC) dataset, demonstrate that the proposed method achieves better and more stable performance than state-of-the-art attention models.
first_indexed 2024-03-09T22:41:13Z
format Article
id doaj.art-02f48c072357490895a7b463c02be989
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T22:41:13Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-02f48c072357490895a7b463c02be989
Record timestamp: 2023-11-23T18:37:41Z
Language: eng
Publisher: MDPI AG
Journal: Applied Sciences (ISSN 2076-3417)
Published: 2022-02-01, volume 12, issue 4, article 2006
DOI: 10.3390/app12042006
Title: A Deep Attention Model for Action Recognition from Skeleton Data
Author affiliations:
Yanbo Gao (0): School of Software, Shandong University, Jinan 250101, China
Chuankun Li (1): State Key Laboratory of Dynamic Testing Technology, School of Information and Communication Engineering, North University of China, Taiyuan 030051, China
Shuai Li (2): School of Control Science and Engineering, Shandong University, Jinan 250100, China
Xun Cai (3): School of Software, Shandong University, Jinan 250101, China
Mao Ye (4): School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Hui Yuan (5): School of Control Science and Engineering, Shandong University, Jinan 250100, China
title A Deep Attention Model for Action Recognition from Skeleton Data
topic skeleton-based action recognition
IndRNN
RNN
attention model
url https://www.mdpi.com/2076-3417/12/4/2006