Data Augmentation and Effective Feature Selection in Generative Adversarial Networks for Speech Emotion Recognition
Until now, it has been unclear whether feature selection methods improve or degrade the efficiency of SER systems. This article discusses feature selection for data augmentation in a speech emotion recognition system. The experiments were performed on four databases...
Main Authors: | Arash Shilandari, Hossein Marvi, Hossein Khosravi |
---|---|
Format: | Article |
Language: | fas |
Published: | Semnan University, 2023-03-01 |
Series: | مجله مدل سازی در مهندسی (Journal of Modeling in Engineering) |
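Subjects: | speech emotion recognition; speech feature selection; data augmentation; generative adversarial networks |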
Online Access: | https://modelling.semnan.ac.ir/article_7526_d6a743149597c221d42f4f8dbf3521ea.pdf |
_version_ | 1797296450053341184 |
---|---|
author | Arash Shilandari; Hossein Marvi; Hossein Khosravi |
author_facet | Arash Shilandari; Hossein Marvi; Hossein Khosravi |
author_sort | Arash Shilandari |
collection | DOAJ |
description | Until now, it has been unclear whether feature selection methods improve or degrade the efficiency of SER systems. This article discusses feature selection for data augmentation in a speech emotion recognition (SER) system. The experiments were performed on four databases: EMO-DB, eNTERFACE05, SAVEE, and IEMOCAP. Simulations were run in Python, and data analysis was performed on all four databases for four emotions: sadness, anger, happiness, and neutral. The paper builds a GAN to augment data and demonstrates that the artificial data generated by GANs can be used not only for augmentation but also for feature selection to improve classification performance. We used a GAN to augment data and applied two feature selection networks, Fisher and LDA, in two steps. An SVM was used to classify emotions. With feedback from the classification network, we could bring the SER system to the optimal point in terms of sample count and feature vector dimension. PCA is more effective on correlated data. The LDA algorithm works better on low-dimensional data. Fisher's method is better than PCA at reducing feature dimensionality. The results showed that using both the LDA and Fisher methods in the GANs can filter the features into fewer dimensions while preserving the emotional information needed for classification. The results were compared with recent research, and the proposed method achieved 86.32% accuracy on the EMO-DB database. |
first_indexed | 2024-03-07T22:04:54Z |
format | Article |
id | doaj.art-11fd946107e14acf8fbbe6b7a19630c8 |
institution | Directory of Open Access Journals |
issn | 2008-4854 2783-2538 |
language | fas |
last_indexed | 2024-03-07T22:04:54Z |
publishDate | 2023-03-01 |
publisher | Semnan University |
record_format | Article |
series | مجله مدل سازی در مهندسی |
spelling | doaj.art-11fd946107e14acf8fbbe6b7a19630c8; 2024-02-23T19:10:30Z; fas; Semnan University; مجله مدل سازی در مهندسی; 2008-4854; 2783-2538; 2023-03-01; 21; 72; 1; 17; 10.22075/jme.2022.24865.2159; 7526; Data Augmentation and Effective Feature Selection in Generative Adversarial Networks for Speech Emotion Recognition; Arash Shilandari (0); Hossein Marvi (1); Hossein Khosravi (2); Faculty of Electrical Engineering, Shahrood University of Technology, Shahrood, Iran (0, 1, 2); https://modelling.semnan.ac.ir/article_7526_d6a743149597c221d42f4f8dbf3521ea.pdf; speech emotion recognition; speech feature selection; data augmentation; generative adversarial networks |
title | Data Augmentation and Effective Feature Selection in Generative Adversarial Networks for Speech Emotion Recognition |
topic | speech emotion recognition; speech feature selection; data augmentation; generative adversarial networks |
url | https://modelling.semnan.ac.ir/article_7526_d6a743149597c221d42f4f8dbf3521ea.pdf |
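
The description names Fisher's method as one of the two feature selection steps but does not give a formula. The following is a minimal sketch of Fisher-score feature ranking, assuming the common definition (between-class scatter divided by within-class scatter, computed per feature). The function names `fisher_scores` and `top_k_features`, the toy feature values, and the two-class labels are hypothetical illustrations, not the authors' implementation or data.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    Assumed definition (not taken from the paper):
        score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    # Group the feature vectors by their emotion label.
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(samples[0])):
        overall_mean = fmean(row[j] for row in samples)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for rows in by_class.values():
            column = [row[j] for row in rows]
            class_mean = fmean(column)
            between += len(column) * (class_mean - overall_mean) ** 2
            within += len(column) * pvariance(column, mu=class_mean)
        # A feature with zero within-class scatter separates classes perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(samples, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 6 utterances, 3 acoustic features, 2 emotions.
# Only feature 0 differs in mean between the two classes.
samples = [
    [0.1, 5.0, 1.0],
    [0.2, 3.0, 1.1],
    [0.0, 4.0, 0.9],
    [1.1, 4.5, 1.0],
    [0.9, 3.5, 1.2],
    [1.0, 4.0, 0.8],
]
labels = ["sad", "sad", "sad", "angry", "angry", "angry"]

selected = top_k_features(samples, labels, k=1)
print(selected)
```

In a pipeline like the one the description outlines, the retained feature indices would be applied to both real and GAN-generated samples before the LDA step and the SVM classifier; that wiring is omitted here because the record does not specify it.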