A Facial Feature and Lip Movement Enhanced Audio-Visual Speech Separation Model

The cocktail party problem can be addressed more effectively by leveraging both the visual and the audio information of the speaker. This paper proposes a method that improves audio separation using two visual cues: facial features and lip movement. Firstly, residual connections are introduced in the audio sep...

Bibliographic Details
Main Authors:	Guizhu Li, Min Fu, Mengnan Sun, Xuefeng Liu, Bing Zheng
Format:	Article
Language:	English
Published:	MDPI AG 2023-10-01
Series:	Sensors
Subjects:	speech separation; audio-visual; attention mechanism; U-Net
Online Access:	https://www.mdpi.com/1424-8220/23/21/8770

Similar Items