PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition

Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully exploit both spatial and temporal features. Recently, GCN-based methods, which model the human skeleton as a spatial-temporal graph, have achieved remarkable performance. Ho...


Bibliographic Details
Main Authors: Hongye Yang, Yuzhang Gu, Jianchao Zhu, Keli Hu, Xiaolin Zhang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Computer vision; skeleton-based action recognition; temporal and channel-wise attention
Online Access: https://ieeexplore.ieee.org/document/8950167/
_version_ 1818323688560263168
author Hongye Yang
Yuzhang Gu
Jianchao Zhu
Keli Hu
Xiaolin Zhang
author_sort Hongye Yang
collection DOAJ
description Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully exploit both spatial and temporal features. Recently, GCN-based methods, which model the human skeleton as a spatial-temporal graph, have achieved remarkable performance. However, most GCN-based methods use a fixed adjacency matrix defined by the dataset, which captures only the structural information provided by joints directly connected through bones and ignores the dependencies between distant joints that are not connected. In addition, using the same fixed adjacency matrix in all layers prevents the network from extracting multi-level semantic features. In this paper, we propose a pseudo graph convolutional network with temporal and channel-wise attention (PGCN-TCA) to solve these problems. The fixed normalized adjacency matrix is replaced with a learnable matrix, so that the matrix can learn dependencies both between connected joints and between joints that are not physically connected. At the same time, the learnable matrices in different layers help the network capture multi-level features in the spatial domain. Moreover, since frames and input channels that contain outstanding characteristics play significant roles in distinguishing an action from others, we propose a mixed temporal and channel-wise attention. Our method achieves performance comparable to state-of-the-art methods on the NTU-RGB+D and HDM05 datasets.
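The contrast the description draws between a fixed adjacency matrix and a learnable one can be illustrated with a small sketch. This is a toy illustration only, not the authors' implementation: the 4-joint chain, the function names `normalized_adjacency` and `aggregate`, the row normalization, and the hand-set entry standing in for a trained weight are all assumptions made for the example.

```python
# Toy sketch (assumed setup, not the paper's code): one spatial mixing step
# out[i] = sum_j M[i][j] * x[j], with M either a fixed normalized adjacency
# matrix or a free "learnable" matrix of the same shape.

def normalized_adjacency(num_joints, bones):
    """Row-normalized adjacency with self-loops, built from a bone list."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in bones:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]

def aggregate(matrix, features):
    """Mix per-joint feature vectors using the given joint-to-joint matrix."""
    dim = len(features[0])
    return [[sum(matrix[i][j] * features[j][c] for j in range(len(features)))
             for c in range(dim)]
            for i in range(len(matrix))]

# Hypothetical skeleton: a chain of four joints, 0-1-2-3.
bones = [(0, 1), (1, 2), (2, 3)]
fixed = normalized_adjacency(4, bones)

# In a learnable matrix every entry is a free parameter. Here one entry is
# set by hand to mimic a weight that training might assign between two
# joints that are not connected by a bone.
learnable = [row[:] for row in fixed]
learnable[0][3] = 0.5

features = [[1.0], [0.0], [0.0], [10.0]]  # one channel per joint
out_fixed = aggregate(fixed, features)
out_learned = aggregate(learnable, features)

print(out_fixed[0], out_learned[0])
```

With the fixed matrix, joint 0 never receives information from joint 3 in a single step, because the entry between them is zero. With the modified matrix, joint 3 contributes directly to joint 0, which is the kind of long-range dependency the description refers to.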
first_indexed 2024-12-13T11:16:40Z
format Article
id doaj.art-881ee6d774174355894e940e6636d373
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-13T11:16:40Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-881ee6d774174355894e940e6636d373
2022-12-21T23:48:36Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
volume 8, pages 10040-10047
doi 10.1109/ACCESS.2020.2964115
article number 8950167
PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition
Hongye Yang (0) https://orcid.org/0000-0002-4974-3476
Yuzhang Gu (1) https://orcid.org/0000-0002-9935-5156
Jianchao Zhu (2) https://orcid.org/0000-0003-0742-2102
Keli Hu (3) https://orcid.org/0000-0002-5628-7640
Xiaolin Zhang (4) https://orcid.org/0000-0003-3307-9838
Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China
Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China
School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
Department of Computer Science and Engineering, Shaoxing University, Shaoxing, China
Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China
https://ieeexplore.ieee.org/document/8950167/
Computer vision
skeleton-based action recognition
temporal and channel-wise attention
title PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition
topic Computer vision
skeleton-based action recognition
temporal and channel-wise attention
url https://ieeexplore.ieee.org/document/8950167/