Multisensory Fusion for Unsupervised Spatiotemporal Speaker Diarization

Speaker diarization consists of answering the question of “who spoke when” in audio recordings. In meeting scenarios, the task of labeling audio with the corresponding speaker identities can be further assisted by the exploitation of spatial features. This work proposes a framework designed to asses...

Bibliographic Details
Main Authors: Paris Xylogiannis, Nikolaos Vryzas, Lazaros Vrysis, Charalampos Dimoulas
Format: Article
Language: English
Published: MDPI AG 2024-06-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/24/13/4229