Seeing voices and hearing faces: Cross-modal biometric matching

We introduce a seemingly impossible task: given only an audio clip of someone speaking, decide which of two face images is the speaker. In this paper we study this, and a number of related cross-modal tasks, aimed at answering the question: how much can we infer from the voice about the fac...

Bibliographic Details
Main Authors: Nagrani, A, Albanie, S, Zisserman, A
Format: Conference item
Published: Institute of Electrical and Electronics Engineers 2018