Sound Can Help Us See More Clearly

In the field of video action classification, existing network frameworks often only use video frames as input. When the object involved in the action does not appear in a prominent position in the video frame, the network cannot accurately classify it. We introduce a new neural network structure tha...

Full description

Bibliographic Details
Main Authors: Yongsheng Li, Tengfei Tu, Hua Zhang, Jishuai Li, Zhengping Jin, Qiaoyan Wen
Format: Article
Language:English
Published: MDPI AG 2022-01-01
Series:Sensors
Subjects:
Online Access:https://www.mdpi.com/1424-8220/22/2/599