Enhanced Speech Emotion Recognition Using DCGAN-Based Data Augmentation
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-09-01 |
Series: | Electronics |
Subjects: | |
Online Access: | https://www.mdpi.com/2079-9292/12/18/3966 |
Summary: | Although emotional speech recognition has received increasing attention in research and applications, it remains challenging because emotions are diverse and complex and the available datasets are limited. To address these limitations, we propose a novel approach that uses DCGAN to augment data from the RAVDESS and EmoDB databases. We then assess emotion recognition on mel-spectrogram data with a model that combines a CNN and a BiLSTM. Preliminary experimental results indicate that the proposed technique helps improve emotional speech recognition performance. The results of this study provide directions for further development in the field of emotional speech recognition and point to potential practical applications. |
ISSN: | 2079-9292 |