Robust Multi-Scenario Speech-Based Emotion Recognition System
Every human being experiences emotions daily, e.g., joy, sadness, fear, anger. These might be revealed through speech—words are often accompanied by our emotional states when we talk. Different acoustic emotional databases are freely available for solving the Emotional Speech Recognition (ESR) task....
Main Authors: | Fangfang Zhu-Zhou, Roberto Gil-Pita, Joaquín García-Gómez, Manuel Rosa-Zurera |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-03-01 |
Series: | Sensors |
Subjects: | affective computing; emotion recognition; speech emotions |
Online Access: | https://www.mdpi.com/1424-8220/22/6/2343 |
_version_ | 1797442334447632384 |
---|---|
author | Fangfang Zhu-Zhou; Roberto Gil-Pita; Joaquín García-Gómez; Manuel Rosa-Zurera |
author_sort | Fangfang Zhu-Zhou |
collection | DOAJ |
description | Every human being experiences emotions daily, e.g., joy, sadness, fear, anger. These might be revealed through speech—words are often accompanied by our emotional states when we talk. Different acoustic emotional databases are freely available for solving the Emotional Speech Recognition (ESR) task. Unfortunately, many of them were generated under non-real-world conditions, i.e., actors played the emotions, and the recordings were made under fictitious, noise-free circumstances. Another weakness in the design of emotion recognition systems is the scarcity of patterns in the available databases, which causes generalization problems and leads to overfitting. This paper examines how different elements of the recording environment impact system performance, using a simple logistic regression algorithm. Specifically, we conducted experiments simulating different scenarios with different levels of Gaussian white noise, real-world noise, and reverberation. The results show a performance deterioration in all scenarios, with the error probability increasing from 25.57% to 79.13% in the worst case. Additionally, a virtual enlargement method and a robust multi-scenario speech-based emotion recognition system are proposed. Our system’s average error probability of 34.57% is comparable to the best-case scenario’s 31.55%. The findings support the prediction that simulated emotional speech databases do not offer sufficient closeness to real scenarios. |
first_indexed | 2024-03-09T12:40:20Z |
format | Article |
id | doaj.art-ee3b1151b9db47428fcba27149206076 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-09T12:40:20Z |
publishDate | 2022-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-ee3b1151b9db47428fcba27149206076 2023-11-30T22:19:44Z eng MDPI AG Sensors 1424-8220 2022-03-01 Vol. 22, Iss. 6, Art. 2343, doi:10.3390/s22062343 Robust Multi-Scenario Speech-Based Emotion Recognition System Fangfang Zhu-Zhou; Roberto Gil-Pita; Joaquín García-Gómez; Manuel Rosa-Zurera (all: Department of Signal Theory and Communications, University of Alcalá, 28805 Alcalá de Henares, Madrid, Spain) https://www.mdpi.com/1424-8220/22/6/2343 affective computing; emotion recognition; speech emotions |
spellingShingle | Fangfang Zhu-Zhou; Roberto Gil-Pita; Joaquín García-Gómez; Manuel Rosa-Zurera; Robust Multi-Scenario Speech-Based Emotion Recognition System; Sensors; affective computing; emotion recognition; speech emotions |
title | Robust Multi-Scenario Speech-Based Emotion Recognition System |
title_sort | robust multi scenario speech based emotion recognition system |
topic | affective computing; emotion recognition; speech emotions |
url | https://www.mdpi.com/1424-8220/22/6/2343 |
work_keys_str_mv | AT fangfangzhuzhou robustmultiscenariospeechbasedemotionrecognitionsystem AT robertogilpita robustmultiscenariospeechbasedemotionrecognitionsystem AT joaquingarciagomez robustmultiscenariospeechbasedemotionrecognitionsystem AT manuelrosazurera robustmultiscenariospeechbasedemotionrecognitionsystem |
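The abstract describes simulating degraded recording scenarios by corrupting clean speech with Gaussian white noise at different levels. As a minimal sketch of that kind of scenario simulation (the function name, interface, and SNR values here are illustrative assumptions, not taken from the paper), the noise is typically scaled so that the corrupted signal hits a target signal-to-noise ratio:

```python
import numpy as np

def add_white_noise(speech, snr_db, seed=0):
    """Corrupt a speech signal with Gaussian white noise at a target SNR (dB).

    Illustrative sketch: scales the noise power relative to the measured
    signal power so the result has (approximately) the requested SNR.
    """
    rng = np.random.default_rng(seed)
    speech = np.asarray(speech, dtype=float)
    signal_power = np.mean(speech ** 2)
    # SNR (dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=speech.shape)
    return speech + noise
```

The other conditions the abstract mentions could be simulated analogously: real-world noise by adding a recorded noise segment scaled the same way, and reverberation by convolving the clean speech with a room impulse response.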