Spot the conversation: Speaker diarisation in the wild

The goal of this paper is speaker diarisation of videos collected ‘in the wild’. We make three key contributions. First, we propose an automatic audio-visual diarisation method for YouTube videos. Our method consists of active speaker detection using audio-visual methods and speaker verif...

Bibliographic details
Main authors: Chung, JS, Huh, J, Nagrani, A, Afouras, T, Zisserman, A
Format: Conference item
Language: English
Published: International Speech Communication Association 2020