Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction

People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues. Moreover, the perception of static aspects of emotion (speaker's arousal level is high/low) and the dynamic aspects of emotion (speak...

Bibliographic Details
Main Authors: Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-12-01
Series: Frontiers in Computer Science
Online Access: https://www.frontiersin.org/articles/10.3389/fcomp.2021.767767/full