3D-ShuffleViT: An Efficient Video Action Recognition Network with Deep Integration of Self-Attention and Convolution

Compared with traditional methods, action recognition models based on 3D convolutional deep neural networks capture spatio-temporal features more accurately and therefore achieve higher recognition accuracy. However, the large number of parameters and computational requirements of 3D models make it difficult to depl...

Bibliographic Details
Main Authors: Yinghui Wang, Anlei Zhu, Haomiao Ma, Lingyu Ai, Wei Song, Shaojie Zhang
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/11/18/3848