Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach

Bibliographic Details
Main Authors: Marián Trnka, Sakhia Darjaa, Marian Ritomský, Róbert Sabo, Milan Rusko, Meilin Schaper, Tim H. Stelkens-Kobsch
Format: Article
Language: English
Published: MDPI AG, 2021-11-01
Series: Electronics
Subjects: emotion recognition; dimensional to categorical emotion representation mapping; activation; arousal and valence regression; X-vectors; SVM
Online Access: https://www.mdpi.com/2079-9292/10/23/2950
_version_ 1797507952797548544
author Marián Trnka
Sakhia Darjaa
Marian Ritomský
Róbert Sabo
Milan Rusko
Meilin Schaper
Tim H. Stelkens-Kobsch
author_facet Marián Trnka
Sakhia Darjaa
Marian Ritomský
Róbert Sabo
Milan Rusko
Meilin Schaper
Tim H. Stelkens-Kobsch
author_sort Marián Trnka
collection DOAJ
description A frequently used procedure to examine the relationship between categorical and dimensional descriptions of emotions is to ask subjects to place verbal expressions representing emotions in a continuous multidimensional emotional space. This work chooses a different approach. It aims at creating a system that predicts the values of Activation and Valence (AV) directly from the sound of emotional speech utterances, without the use of their semantic content or any other additional information. The system uses X-vectors to represent the sound characteristics of an utterance and a Support Vector Regressor for the estimation of the AV values. The system is trained on a pool of three publicly available databases with dimensional annotation of emotions. The quality of regression is evaluated on the test sets of the same databases. Mapping of categorical emotions to the dimensional space is tested on another pool of eight categorically annotated databases. The aim of the work was to test whether, in each unseen database, the predicted values of Valence and Activation would place emotion-tagged utterances in the AV space in accordance with expectations based on Russell's circumplex model of affective space. Due to the great variability of speech data, the emotion clusters form overlapping clouds. Their average locations can be represented by centroids. A hypothesis on the position of these centroids is formulated and evaluated. The system's ability to separate the emotions is evaluated by measuring the distances between the centroids. It can be concluded that the system works as expected and that the positions of the clusters follow the hypothesized rules. Although the variance in individual measurements is still very high and the overlap of the emotion clusters is large, the AV coordinates predicted by the system lead to an observable separation of the emotions in accordance with the hypothesis. Knowledge from the training databases can therefore be used to predict AV coordinates for unseen data of various origins. This could be used, for example, to detect high levels of stress or depression. With the appearance of more dimensionally annotated training data, systems predicting emotional dimensions from speech sound will become more robust and usable in practical applications in call centers, avatars, robots, information-providing systems, security applications, and the like.
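As an illustration of the approach summarized in the description above, the following minimal sketch (Python, scikit-learn) fits one Support Vector Regressor per dimension on precomputed x-vector embeddings, predicts AV coordinates for categorically labelled utterances, and measures the pairwise distances between the resulting emotion centroids. It is a sketch under assumptions, not the authors' implementation: it presumes the x-vectors have already been extracted with a pretrained embedding model, and all variable names (train_xvectors, train_activation, train_valence, emotion_labels) and hyperparameters are hypothetical.

# Minimal sketch of the described pipeline; x-vector extraction is assumed done elsewhere.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fit_av_regressors(train_xvectors, train_activation, train_valence):
    """Fit one Support Vector Regressor per emotional dimension (Activation, Valence)."""
    reg_a = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
    reg_v = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
    reg_a.fit(train_xvectors, train_activation)
    reg_v.fit(train_xvectors, train_valence)
    return reg_a, reg_v

def emotion_centroids(reg_a, reg_v, xvectors, emotion_labels):
    """Predict AV coordinates for emotion-tagged utterances and return the
    centroid of each emotion cluster in the AV plane."""
    av = np.column_stack([reg_a.predict(xvectors), reg_v.predict(xvectors)])
    labels = np.asarray(emotion_labels)
    return {e: av[labels == e].mean(axis=0) for e in set(emotion_labels)}

def centroid_distances(centroids):
    """Euclidean distances between emotion centroids, a rough measure of how
    well the predicted AV space separates the emotions."""
    names = sorted(centroids)
    return {(a, b): float(np.linalg.norm(centroids[a] - centroids[b]))
            for i, a in enumerate(names) for b in names[i + 1:]}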
first_indexed 2024-03-10T04:55:44Z
format Article
id doaj.art-dfe5465800bb4012ba1ea0bc461d4abc
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-10T04:55:44Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Electronics
spelling Record ID: doaj.art-dfe5465800bb4012ba1ea0bc461d4abc (updated 2023-11-23T02:16:35Z)
Language: English
Publisher: MDPI AG
Journal: Electronics, ISSN 2079-9292, Vol. 10, No. 23, Article 2950, published 2021-11-01
DOI: 10.3390/electronics10232950
Title: Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
Authors: Marián Trnka, Sakhia Darjaa, Marian Ritomský, Róbert Sabo, Milan Rusko (Institute of Informatics of the Slovak Academy of Sciences, 845 07 Bratislava, Slovakia); Meilin Schaper, Tim H. Stelkens-Kobsch (Institute of Flight Guidance, German Aerospace Center, 38108 Braunschweig, Germany)
Online Access: https://www.mdpi.com/2079-9292/10/23/2950
Subjects: emotion recognition; dimensional to categorical emotion representation mapping; activation; arousal and valence regression; X-vectors; SVM
spellingShingle Marián Trnka
Sakhia Darjaa
Marian Ritomský
Róbert Sabo
Milan Rusko
Meilin Schaper
Tim H. Stelkens-Kobsch
Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
Electronics
emotion recognition
dimensional to categorical emotion representation mapping
activation
arousal and valence regression
X-vectors
SVM
title Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
title_full Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
title_fullStr Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
title_full_unstemmed Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
title_short Mapping Discrete Emotions in the Dimensional Space: An Acoustic Approach
title_sort mapping discrete emotions in the dimensional space an acoustic approach
topic emotion recognition
dimensional to categorical emotion representation mapping
activation
arousal and valence regression
X-vectors
SVM
url https://www.mdpi.com/2079-9292/10/23/2950
work_keys_str_mv AT mariantrnka mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT sakhiadarjaa mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT marianritomsky mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT robertsabo mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT milanrusko mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT meilinschaper mappingdiscreteemotionsinthedimensionalspaceanacousticapproach
AT timhstelkenskobsch mappingdiscreteemotionsinthedimensionalspaceanacousticapproach