Emotion recognition in speech using cross-modal transfer in the wild
Obtaining large, human-labelled speech datasets to train models for emotion recognition is a notoriously challenging task, hindered by annotation cost and label ambiguity. In this work, we consider the task of learning embeddings for speech classification without access to any form of labelled audio...
| Main Authors: | Albanie, S, Nagrani, A, Vedaldi, A, Zisserman, A |
|---|---|
| Format: | Internet publication |
| Language: | English |
| Published: | arXiv 2018 |
Similar Items
- Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
  by: Albanie, S, et al.
  Published: (2018)
- Disentangled Speech Embeddings Using Cross-Modal Self-Supervision
  by: Nagrani, A, et al.
  Published: (2020)
- Learnable PINs: Cross-modal embeddings for person identity
  by: Nagrani, A, et al.
  Published: (2018)
- Seeing voices and hearing faces: Cross-modal biometric matching
  by: Nagrani, A, et al.
  Published: (2018)
- Speech2Action: Cross-modal supervision for action recognition
  by: Nagrani, A, et al.
  Published: (2020)