Attention Relational Network for Skeleton-Based Group Activity Recognition

Group activity recognition is a significant and challenging task in computer vision. Existing approaches to group activity prediction can be grouped into traditional hand-crafted features, RGB video features, and skeleton data-based deep learning architectures, such as Graph Convolutional Networks (GCNs...


Bibliographic Details
Main Authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed
Format: Article
Language: English
Published: IEEE 2023-01-01
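Series: IEEE Access
Subjects: Group activity recognition; attention mechanism; relational network; skeleton joint director sequences
Online Access: https://ieeexplore.ieee.org/document/10318128/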
_version_ 1797473537441660928
author Chuanchuan Wang
Ahmad Sufril Azlan Mohamed
author_facet Chuanchuan Wang
Ahmad Sufril Azlan Mohamed
author_sort Chuanchuan Wang
collection DOAJ
description Group activity recognition is a significant and challenging task in computer vision. Existing approaches to group activity prediction can be grouped into traditional hand-crafted features, RGB video features, and skeleton data-based deep learning architectures such as Graph Convolutional Networks (GCNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs). However, they rarely explore pose information and rarely use relational networks to reason about group activity behavior. In this work, we leverage minimal prior knowledge about the skeleton information to reason about the interactions in a group activity. The objective is to obtain discriminative representations and filter out ambiguous actions to improve group activity recognition. Our contribution is the Attention Relation Network (ARN), which fuses attention mechanisms and joint vector sequences into the relation network. The skeleton joint vector sequences carry previously unexplored pose information, and greater significance is assigned to individuals who are more relevant for distinguishing the group activity. First, our model focuses on edge-level information in the skeleton data (both edge and edge motion data), taking directionality into account, to analyze the spatiotemporal aspects of the action. Second, recognizing the inherent directionality of motion, we define multiple directions for skeleton edges and extract distinct motion features (including translation and rotation information) aligned with these orientations, making fuller use of the motion attributes of the action. We also introduce a representation of human motion obtained by combining relational networks and examining their integrated characteristics. Extensive experiments on the Hockey and UT-Interaction datasets show that our method is competitive with the state of the art. The results demonstrate the modeling potential of a skeleton-based method for group activity recognition.
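The edge-level features and attention weighting mentioned in the description can be illustrated with a minimal NumPy sketch. This is not the authors' ARN implementation: the 5-joint skeleton in `EDGES`, the function names `edge_features` and `attention_pool`, and the linear scoring vector `w` are all hypothetical choices made for illustration, and the record does not specify how the paper computes attention scores or relations.

```python
import numpy as np

# Hypothetical 5-joint skeleton: (child, parent) index pairs defining directed edges.
EDGES = [(1, 0), (2, 1), (3, 1), (4, 1)]

def edge_features(joints, edges=EDGES):
    """joints: array of shape (T, J, 2), 2D joint coordinates over T frames.
    Returns (edge, motion), each of shape (T, E, 2)."""
    joints = np.asarray(joints, dtype=float)
    child = [c for c, _ in edges]
    parent = [p for _, p in edges]
    # Directed edge vector pointing from the parent joint to the child joint.
    edge = joints[:, child] - joints[:, parent]
    # Edge motion: frame-to-frame change of each edge vector (zero for the first frame).
    motion = np.zeros_like(edge)
    motion[1:] = edge[1:] - edge[:-1]
    return edge, motion

def attention_pool(person_feats, w):
    """person_feats: (N, D), one feature vector per person; w: (D,) scoring vector.
    Assumed scoring rule: softmax over the linear scores person_feats @ w."""
    person_feats = np.asarray(person_feats, dtype=float)
    scores = person_feats @ np.asarray(w, dtype=float)
    scores = scores - scores.max()            # subtract the max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    # Group-level feature: attention-weighted sum of the per-person features.
    return weights @ person_feats, weights

# Usage on random data: 2 people, 8 frames, 5 joints, 2D coordinates.
rng = np.random.default_rng(0)
clip = rng.normal(size=(2, 8, 5, 2))
feats = np.stack([np.concatenate([e.ravel(), m.ravel()])
                  for e, m in map(edge_features, clip)])
group_vec, weights = attention_pool(feats, w=np.ones(feats.shape[1]))
```

In a trained model the scoring vector would be learned rather than fixed, and the pooled vector would feed a classifier; the sketch only shows how directed edges, edge motion, and per-person weighting fit together.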
first_indexed 2024-03-09T20:15:54Z
format Article
id doaj.art-c669ea1244a646738caf30e12f4bc7dd
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-09T20:15:54Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-c669ea1244a646738caf30e12f4bc7dd
2023-11-24T00:01:00Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
11
129230
129239
10.1109/ACCESS.2023.3332651
10318128
Attention Relational Network for Skeleton-Based Group Activity Recognition
Chuanchuan Wang (https://orcid.org/0009-0001-8061-5368), School of Computer Sciences, Universiti Sains Malaysia, George Town, Penang, Malaysia
Ahmad Sufril Azlan Mohamed (https://orcid.org/0000-0002-2838-0872), School of Computer Sciences, Universiti Sains Malaysia, George Town, Penang, Malaysia
https://ieeexplore.ieee.org/document/10318128/
Group activity recognition
attention mechanism
relational network
skeleton joint director sequences
title Attention Relational Network for Skeleton-Based Group Activity Recognition
topic Group activity recognition
attention mechanism
relational network
skeleton joint director sequences
url https://ieeexplore.ieee.org/document/10318128/