WLiT: Windows and Linear Transformer for Video Action Recognition

The emergence of the Transformer has driven rapid progress in video understanding, but it also brings high computational complexity. Previous methods divide the feature maps into windows along the spatiotemporal dimensions and then compute attention. There ar...

Bibliographic Details
Main Authors: Ruoxi Sun, Tianzhao Zhang, Yong Wan, Fuping Zhang, Jianming Wei
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/23/3/1616