Utterance-level aggregation for speaker recognition in the wild
The objective of this paper is speaker recognition 'in the wild', where utterances may be of variable length and may also contain irrelevant signals. Crucial elements in the design of deep networks for this task are the type of trunk (frame-level) network and the method of temporal aggregation. W...
Main Authors: | Xie, W, Nagrani, A, Chung, J, Zisserman, A |
---|---|
Format: | Conference item |
Published: | IEEE 2019 |
Similar Items
- Voxceleb: large-scale speaker verification in the wild
  by: Nagrani, A, et al.
  Published: (2019)
- Spot the conversation: Speaker diarisation in the wild
  by: Chung, JS, et al.
  Published: (2020)
- VoxCeleb2: Deep speaker recognition
  by: Chung, J, et al.
  Published: (2018)
- The VoxCeleb speaker recognition challenge: a retrospective
  by: Huh, J, et al.
  Published: (2024)
- VoxCeleb: a large-scale speaker identification dataset
  by: Nagrani, A, et al.
  Published: (2017)