Attention-Based Temporal-Frequency Aggregation for Speaker Verification

Convolutional neural networks (CNNs) have significantly advanced speaker verification (SV) systems because of their powerful deep feature learning capability. In CNN-based SV systems, utterance-level aggregation is an important component that compresses the frame-level features...


Bibliographic Details
Main Authors: Meng Wang, Dazheng Feng, Tingting Su, Mohan Chen
Format: Article
Language: English
Published: MDPI AG 2022-03-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/22/6/2147