Speech Emotion Recognition Based on Parallel CNN-Attention Networks with Multi-Fold Data Augmentation

This paper studies automatic speech emotion recognition (SER), classifying eight emotions with parallel networks trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). A combination of a CNN-based network and attention-based networ...

Bibliographic Details
Main Authors:	John Lorenzo Bautista, Yun Kyung Lee, Hyun Soon Shin
Format:	Article
Language:	English
Published:	MDPI AG 2022-11-01
Series:	Electronics
Subjects:	speech emotion recognition; parallel networks; attention-based network; audio data augmentation; transformer; deep learning
Online Access:	https://www.mdpi.com/2079-9292/11/23/3935

Similar Items