PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition
Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully exploit both spatial and temporal features. Recently, GCN-based methods, which model human body skeletons as spatial-temporal graphs, have achieved remarkable performance. Ho...
Main Authors: | Hongye Yang, Yuzhang Gu, Jianchao Zhu, Keli Hu, Xiaolin Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Computer vision; skeleton-based action recognition; temporal and channel-wise attention |
Online Access: | https://ieeexplore.ieee.org/document/8950167/ |
_version_ | 1818323688560263168 |
---|---|
author | Hongye Yang Yuzhang Gu Jianchao Zhu Keli Hu Xiaolin Zhang |
author_facet | Hongye Yang Yuzhang Gu Jianchao Zhu Keli Hu Xiaolin Zhang |
author_sort | Hongye Yang |
collection | DOAJ |
description | Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully exploit both spatial and temporal features. Recently, GCN-based methods, which model human body skeletons as spatial-temporal graphs, have achieved remarkable performance. However, most GCN-based methods use a fixed adjacency matrix defined by the dataset, which captures only the structural information provided by joints directly connected through bones and ignores the dependencies between distant joints that are not connected. In addition, using the same fixed adjacency matrix in all layers prevents the network from extracting multi-level semantic features. In this paper, we propose a pseudo graph convolutional network with temporal and channel-wise attention (PGCN-TCA) to address these problems. The fixed normalized adjacency matrix is replaced with a learnable matrix. In this way, the matrix can learn the dependencies both between connected joints and between joints that are not physically connected. At the same time, learnable matrices in different layers help the network capture multi-level features in the spatial domain. Moreover, since frames and input channels that contain outstanding characteristics play significant roles in distinguishing an action from others, we propose a mixed temporal and channel-wise attention mechanism. Our method achieves performance comparable to state-of-the-art methods on the NTU-RGB+D and HDM05 datasets. |
first_indexed | 2024-12-13T11:16:40Z |
format | Article |
id | doaj.art-881ee6d774174355894e940e6636d373 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-13T11:16:40Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-881ee6d774174355894e940e6636d373; 2022-12-21T23:48:36Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 10040; 10047; 10.1109/ACCESS.2020.2964115; 8950167; PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition; Hongye Yang 0 https://orcid.org/0000-0002-4974-3476; Yuzhang Gu 1 https://orcid.org/0000-0002-9935-5156; Jianchao Zhu 2 https://orcid.org/0000-0003-0742-2102; Keli Hu 3 https://orcid.org/0000-0002-5628-7640; Xiaolin Zhang 4 https://orcid.org/0000-0003-3307-9838; Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China; Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China; School of Computer Science and Software Engineering, East China Normal University, Shanghai, China; Department of Computer Science and Engineering, Shaoxing University, Shaoxing, China; Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China; Skeleton-based human action recognition has become an active research area in recent years. The key to this task is to fully exploit both spatial and temporal features. Recently, GCN-based methods, which model human body skeletons as spatial-temporal graphs, have achieved remarkable performance. However, most GCN-based methods use a fixed adjacency matrix defined by the dataset, which captures only the structural information provided by joints directly connected through bones and ignores the dependencies between distant joints that are not connected. In addition, using the same fixed adjacency matrix in all layers prevents the network from extracting multi-level semantic features. In this paper, we propose a pseudo graph convolutional network with temporal and channel-wise attention (PGCN-TCA) to address these problems. The fixed normalized adjacency matrix is replaced with a learnable matrix. In this way, the matrix can learn the dependencies both between connected joints and between joints that are not physically connected. At the same time, learnable matrices in different layers help the network capture multi-level features in the spatial domain. Moreover, since frames and input channels that contain outstanding characteristics play significant roles in distinguishing an action from others, we propose a mixed temporal and channel-wise attention mechanism. Our method achieves performance comparable to state-of-the-art methods on the NTU-RGB+D and HDM05 datasets.; https://ieeexplore.ieee.org/document/8950167/; Computer vision; skeleton-based action recognition; temporal and channel-wise attention |
spellingShingle | Hongye Yang Yuzhang Gu Jianchao Zhu Keli Hu Xiaolin Zhang PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition IEEE Access Computer vision skeleton-based action recognition temporal and channel-wise attention |
title | PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition |
title_full | PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition |
title_fullStr | PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition |
title_full_unstemmed | PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition |
title_short | PGCN-TCA: Pseudo Graph Convolutional Network With Temporal and Channel-Wise Attention for Skeleton-Based Action Recognition |
title_sort | pgcn tca pseudo graph convolutional network with temporal and channel wise attention for skeleton based action recognition |
topic | Computer vision skeleton-based action recognition temporal and channel-wise attention |
url | https://ieeexplore.ieee.org/document/8950167/ |
work_keys_str_mv | AT hongyeyang pgcntcapseudographconvolutionalnetworkwithtemporalandchannelwiseattentionforskeletonbasedactionrecognition AT yuzhanggu pgcntcapseudographconvolutionalnetworkwithtemporalandchannelwiseattentionforskeletonbasedactionrecognition AT jianchaozhu pgcntcapseudographconvolutionalnetworkwithtemporalandchannelwiseattentionforskeletonbasedactionrecognition AT kelihu pgcntcapseudographconvolutionalnetworkwithtemporalandchannelwiseattentionforskeletonbasedactionrecognition AT xiaolinzhang pgcntcapseudographconvolutionalnetworkwithtemporalandchannelwiseattentionforskeletonbasedactionrecognition |
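
The description above contrasts a fixed normalized adjacency matrix with a learnable one. The following is a minimal sketch of that contrast, not the authors' implementation. The 3-joint chain skeleton, the symmetric normalization D^{-1/2}(A + I)D^{-1/2}, the hand-set weight 0.5, and all function names are assumptions made for illustration.

```python
import math


def normalized_adjacency(edges, num_joints):
    """Fixed matrix D^{-1/2} (A + I) D^{-1/2} built from the bone list.

    The normalization scheme is an assumption; the record does not say
    which one the paper uses.
    """
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def aggregate(matrix, features):
    """Spatial aggregation: out[i][c] = sum_j matrix[i][j] * features[j][c]."""
    channels = len(features[0])
    return [[sum(matrix[i][j] * features[j][c] for j in range(len(features)))
             for c in range(channels)]
            for i in range(len(matrix))]


# Hypothetical skeleton: hand (0) - elbow (1) - shoulder (2).
edges = [(0, 1), (1, 2)]
fixed = normalized_adjacency(edges, 3)

# Learnable variant: every entry is a free parameter, initialized from the
# fixed matrix. The assignment below stands in for a value that training
# might produce; it is not a learned result.
learnable = [row[:] for row in fixed]
learnable[0][2] = 0.5

# Only the shoulder joint carries a signal (one channel per joint).
features = [[0.0], [0.0], [1.0]]

fixed_out = aggregate(fixed, features)
learned_out = aggregate(learnable, features)

print("hand output, fixed matrix:    ", fixed_out[0][0])
print("hand output, learnable matrix:", learned_out[0][0])
```

With the fixed matrix, the hand receives nothing from the shoulder because no bone connects them. Once the corresponding entry is allowed to be nonzero, the shoulder's feature reaches the hand in a single layer, which is the dependency between distant joints that the description refers to.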