Data Augmentation and Effective Feature Selection in Generative Adversarial Networks for Speech Emotion Recognition

Until now, it has been uncertain whether feature selection methods improve or degrade the efficiency of SER systems. This article discusses feature selection for data augmentation in a speech emotion recognition system. The experiments were performed on four databases...


Bibliographic Details
Main Authors: Arash Shilandari, Hossein Marvi, Hossein Khosravi
Format: Article
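Language: fas
Published: Semnan University 2023-03-01
Series: مجله مدل سازی در مهندسی (Journal of Modeling in Engineering)
Subjects: speech emotion recognition; speech feature selection; data augmentation; generative adversarial networks
Online Access: https://modelling.semnan.ac.ir/article_7526_d6a743149597c221d42f4f8dbf3521ea.pdf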
author Arash Shilandari
Hossein Marvi
Hossein Khosravi
collection DOAJ
description Until now, it has been uncertain whether feature selection methods improve or degrade the efficiency of speech emotion recognition (SER) systems. This article discusses feature selection for data augmentation in an SER system. The experiments were performed on four databases: EMO-DB, eNTERFACE05, SAVEE, and IEMOCAP. Simulations were performed in Python, and data analysis was carried out on all four databases for four emotions: sadness, anger, happiness, and neutral. The paper discusses feature selection with the aim of creating a generative adversarial network (GAN) to augment data in an SER system. It demonstrates that artificial data generated by GANs can be used not only to augment data but also for feature selection to improve classification performance. We used a GAN to augment data and applied two feature-selection networks, based on the Fisher and linear discriminant analysis (LDA) algorithms, in two steps. A support vector machine (SVM) was used to classify emotions. With feedback from the classification network, we could bring the SER system to the optimal point in terms of the number of samples and the feature vector dimensions. Principal component analysis (PCA) is more effective on correlated data. The LDA algorithm works better on low-dimensional data. Fisher's method is better at reducing dimensionality than PCA. The results showed that using both the LDA and Fisher methods in the GANs can filter the features into fewer dimensions while preserving the emotional information needed for classification. The results were compared with recent research, and the proposed method achieved 86.32% accuracy on the EMO-DB database.
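The description names Fisher's method as one of the two feature-selection steps but does not give its formulation. The following is a minimal sketch of Fisher-score feature ranking, assuming the common definition (between-class scatter divided by within-class scatter, computed per feature). The function names `fisher_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the article; the GAN, LDA, and SVM stages are not shown.

```python
from collections import defaultdict


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    Assumed definition (not confirmed by the record):
    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    n_features = len(samples[0])

    # Group the feature vectors by class label.
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0  # between-class scatter of feature j
        within = 0.0   # within-class scatter of feature j
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # A feature with zero within-class scatter is reported as infinite.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data (made up): feature 0 separates the two classes, feature 1 does not.
X = [[0.9, 0.2], [1.1, 0.8], [1.0, 0.5], [3.0, 0.3], [3.2, 0.7], [2.8, 0.5]]
y = ["sadness", "sadness", "sadness", "anger", "anger", "anger"]

scores = fisher_scores(X, y)
selected = top_k_features(scores, k=1)
print(scores, selected)
```

In a pipeline like the one described, the retained columns would then be passed to the next stage (LDA and the SVM classifier); how the article couples this ranking to the GAN is not specified in this record.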
first_indexed 2024-03-07T22:04:54Z
format Article
id doaj.art-11fd946107e14acf8fbbe6b7a19630c8
institution Directory Open Access Journal
issn 2008-4854
2783-2538
language fas
last_indexed 2024-03-07T22:04:54Z
publishDate 2023-03-01
publisher Semnan University
record_format Article
series مجله مدل سازی در مهندسی (Journal of Modeling in Engineering)
spelling doaj.art-11fd946107e14acf8fbbe6b7a19630c8; 2024-02-23T19:10:30Z; fas; Semnan University; مجله مدل سازی در مهندسی; 2008-4854; 2783-2538; 2023-03-01; 2172117; 10.22075/jme.2022.24865.2159; 7526; Arash Shilandari, Hossein Marvi, Hossein Khosravi (Faculty of Electrical Engineering, Shahrood University of Technology, Shahrood, Iran)
title Data Augmentation and Effective Feature Selection in Generative Adversarial Networks for Speech Emotion Recognition
topic speech emotion recognition
speech feature selection
data augmentation
generative adversarial networks
url https://modelling.semnan.ac.ir/article_7526_d6a743149597c221d42f4f8dbf3521ea.pdf