Emotion recognition in speech using cross-modal transfer in the wild

Obtaining large, human-labelled speech datasets to train models for emotion recognition is a notoriously challenging task, hindered by annotation cost and label ambiguity. In this work, we consider the task of learning embeddings for speech classification without access to any form of labelled audio...

Bibliographic Details
Main Authors: Albanie, S; Nagrani, A; Vedaldi, A; Zisserman, A
Format: Internet publication
Language: English
Published: arXiv 2018