Cross-corpus speech emotion recognition using subspace learning and domain adaption
Abstract: Speech emotion recognition (SER) is an active research topic in speech signal processing. When the training data and the test data come from different corpora, their feature distributions differ, which degrades recognition performance. Therefore, in order to solve this probl...
Main Authors: | Xuan Cao, Maoshen Jia, Jiawei Ru, Tun-wen Pai |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2022-12-01 |
Series: | EURASIP Journal on Audio, Speech, and Music Processing |
Subjects: | Speech emotion recognition, Cross-corpus, Subspace learning, Domain adaption |
Online Access: | https://doi.org/10.1186/s13636-022-00264-5 |
_version_ | 1828083285071233024 |
---|---|
author | Xuan Cao Maoshen Jia Jiawei Ru Tun-wen Pai |
collection | DOAJ |
description | Abstract: Speech emotion recognition (SER) is an active research topic in speech signal processing. When the training data and the test data come from different corpora, their feature distributions differ, which degrades recognition performance. To address this problem, this paper proposes a cross-corpus speech emotion recognition method based on subspace learning and domain adaptation. Specifically, the training set and the test set form the source domain and the target domain, respectively. The Hessian matrix is then introduced to obtain the subspace of the extracted features in both the source and target domains. In addition, an information entropy-based domain adaptation method is introduced to construct a common space, in which the difference between the source-domain and target-domain feature distributions is reduced as much as possible. The proposed method is evaluated in extensive cross-corpus speech emotion recognition experiments. The results show that it performs better than some existing subspace learning and domain adaptation methods. |
first_indexed | 2024-04-11T04:06:37Z |
format | Article |
id | doaj.art-b68453e7c64b4d159f1c2f38386f9d7d |
institution | Directory of Open Access Journals |
issn | 1687-4722 |
language | English |
last_indexed | 2024-04-11T04:06:37Z |
publishDate | 2022-12-01 |
publisher | SpringerOpen |
record_format | Article |
series | EURASIP Journal on Audio, Speech, and Music Processing |
spelling | doaj.art-b68453e7c64b4d159f1c2f38386f9d7d; 2023-01-01T12:24:07Z; eng; SpringerOpen; EURASIP Journal on Audio, Speech, and Music Processing; 1687-4722; 2022-12-01; 20221120; 10.1186/s13636-022-00264-5; Cross-corpus speech emotion recognition using subspace learning and domain adaption; Xuan Cao (Faculty of Information Technology, Beijing University of Technology); Maoshen Jia (Faculty of Information Technology, Beijing University of Technology); Jiawei Ru (Faculty of Information Technology, Beijing University of Technology); Tun-wen Pai (Department of Computer Science and Information Engineering, National Taipei University of Technology); Abstract: Speech emotion recognition (SER) is an active research topic in speech signal processing. When the training data and the test data come from different corpora, their feature distributions differ, which degrades recognition performance. To address this problem, this paper proposes a cross-corpus speech emotion recognition method based on subspace learning and domain adaptation. Specifically, the training set and the test set form the source domain and the target domain, respectively. The Hessian matrix is then introduced to obtain the subspace of the extracted features in both the source and target domains. In addition, an information entropy-based domain adaptation method is introduced to construct a common space, in which the difference between the source-domain and target-domain feature distributions is reduced as much as possible. The proposed method is evaluated in extensive cross-corpus speech emotion recognition experiments. The results show that it performs better than some existing subspace learning and domain adaptation methods.; https://doi.org/10.1186/s13636-022-00264-5; Speech emotion recognition; Cross-corpus; Subspace learning; Domain adaption |
title | Cross-corpus speech emotion recognition using subspace learning and domain adaption |
topic | Speech emotion recognition Cross-corpus Subspace learning Domain adaption |
url | https://doi.org/10.1186/s13636-022-00264-5 |
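
Illustration (not part of the catalog record): the abstract's starting point is that features drawn from two corpora follow different distributions. The sketch below shows that mismatch on synthetic data and reduces it with per-corpus z-normalization, a simple baseline swapped in here for illustration only. It is not the paper's Hessian-based subspace learning or its information entropy-based adaptation. The function names, the synthetic Gaussian features, and the shift and scale values are all hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Z-normalize each feature column using this corpus's own mean and std."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def mean_discrepancy(a, b):
    """Squared Euclidean distance between the feature means of two corpora."""
    mean_a = [statistics.fmean(c) for c in zip(*a)]
    mean_b = [statistics.fmean(c) for c in zip(*b)]
    return sum((p - q) ** 2 for p, q in zip(mean_a, mean_b))


def make_corpus(rng, n, shift, scale, dim=4):
    """Hypothetical synthetic features: Gaussian with a corpus-specific shift and scale."""
    return [[rng.gauss(shift, scale) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = make_corpus(rng, 200, shift=0.0, scale=1.0)  # stands in for the training corpus
target = make_corpus(rng, 200, shift=3.0, scale=2.0)  # stands in for the test corpus

before = mean_discrepancy(source, target)
after = mean_discrepancy(standardize(source), standardize(target))
print(f"mean discrepancy before: {before:.3f}, after: {after:.3e}")
```

Matching means and variances per corpus only removes first- and second-order differences in each feature separately. Learning a shared subspace, as the abstract describes, aims at a common representation rather than a per-feature rescaling.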