A Facial Feature and Lip Movement Enhanced Audio-Visual Speech Separation Model

The cocktail party problem can be addressed more effectively by leveraging both the speaker’s visual and audio information. This paper proposes a method that improves audio separation using two visual cues: facial features and lip movement. First, residual connections are introduced in the audio sep...

Bibliographic Details
Main Authors: Guizhu Li, Min Fu, Mengnan Sun, Xuefeng Liu, Bing Zheng
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Sensors
Subjects:
Online Access: https://www.mdpi.com/1424-8220/23/21/8770