Target Speaker Extraction Using Attention-Enhanced Temporal Convolutional Network

When recording conversations, there may be multiple people talking at once. While our human ears can filter out unwanted sounds, this can be challenging for automatic speech recognition (ASR) systems, leading to reduced accuracy. To address this issue, preprocessing mechanisms such as speech separat...

Bibliographic Details
Main Authors:	Jian-Hong Wang, Yen-Ting Lai, Tzu-Chiang Tai, Phuong Thi Le, Tuan Pham, Ze-Yu Wang, Yung-Hui Li, Jia-Ching Wang, Pao-Chi Chang
Format:	Article
Language:	English
Published:	MDPI AG 2024-01-01
Series:	Electronics
Subjects:	deep learning; target speaker extraction; temporal convolutional network (TCN); convolutional neural network (CNN); automatic speech recognition (ASR)
Online Access:	https://www.mdpi.com/2079-9292/13/2/307

Similar Items