Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction
People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues. Moreover, the perception of static aspects of emotion (speaker's arousal level is high/low) and the dynamic aspects of emotion (speak...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2021-12-01 |
Series: | Frontiers in Computer Science |
Online Access: | https://www.frontiersin.org/articles/10.3389/fcomp.2021.767767/full |