Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interaction. The aim of an SER system is to recognize human emotion by analyzing the acoustics of speech. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features are extracted from the spectrogram of the speech signal using image processing techniques. For this purpose, details in the spectrogram image are first enhanced using histogram equalization. Directional filters are then applied to decompose the image into six directional components. Finally, a binary masking approach is employed to extract SPs from the sub-banded images. The proposed HEs are extracted by applying band-pass filters to the spectrogram image. The dimensionality of the extracted features is reduced with a filter-based feature selection algorithm built on the Fisher discriminant ratio. The classification accuracy of the proposed SER system was evaluated using 10-fold cross-validation on the Berlin database. Average recognition rates of 88.37% and 85.04% were achieved for female and male speakers, respectively. Considering all female and male samples together, an overall recognition rate of 86.91% was obtained.
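The feature-extraction pipeline outlined in the abstract can be illustrated roughly as follows. This is a minimal sketch assuming a log-magnitude spectrogram, rank-based histogram equalization, simple oriented line kernels as directional filters, and a mean-plus-one-standard-deviation threshold for the binary mask; the window length, kernel shape, and masking rule are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np
from scipy.ndimage import convolve
from scipy.signal import spectrogram


def oriented_kernel(theta, size=9):
    """Line-shaped kernel rotated by angle theta (radians); a crude stand-in
    for the paper's directional filters."""
    c = size // 2
    y, x = np.mgrid[-c:c + 1, -c:c + 1]
    d = np.abs(x * np.sin(theta) - y * np.cos(theta))  # distance to the oriented line
    k = (d < 0.8).astype(float)
    return k / k.sum()


def spectral_pattern_features(signal, fs, n_orientations=6):
    # 1) Spectrogram of the speech signal, in dB.
    _, _, sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=256)
    img = 10.0 * np.log10(sxx + 1e-12)

    # 2) Rank-based histogram equalization to highlight spectro-temporal detail.
    ranks = img.ravel().argsort().argsort()
    img_eq = ranks.reshape(img.shape) / float(img.size - 1)

    # 3) Decompose into six directional components and 4) apply a binary mask,
    #    summarising each component by its masked energy.
    feats = []
    for i in range(n_orientations):
        comp = convolve(img_eq, oriented_kernel(i * np.pi / n_orientations), mode="nearest")
        mask = comp > comp.mean() + comp.std()
        feats.append(float((comp * mask).sum() / mask.size))
    return np.array(feats)
```

The paper extracts SPs from sub-banded images (frequency bands of each directional component); the sketch collapses each component to a single masked-energy value only to keep the example short.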

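The dimensionality-reduction and evaluation steps can be sketched in the same spirit. The snippet below ranks features by a one-vs-rest Fisher discriminant ratio, keeps the top-k, and scores a classifier with 10-fold cross-validation; the SVM classifier, k = 50, and the one-vs-rest form of the ratio are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fisher_ratio(X, y):
    # One-vs-rest Fisher discriminant ratio per feature, averaged over classes:
    # squared separation of class means divided by the pooled within-class variance.
    classes = np.unique(y)
    scores = np.zeros(X.shape[1])
    for c in classes:
        a, b = X[y == c], X[y != c]
        scores += (a.mean(0) - b.mean(0)) ** 2 / (a.var(0) + b.var(0) + 1e-12)
    return scores / len(classes)


def evaluate(X, y, k=50):
    keep = np.argsort(fisher_ratio(X, y))[::-1][:k]          # top-k features
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    return cross_val_score(clf, X[:, keep], y, cv=cv).mean()
```

For an unbiased accuracy estimate the Fisher-ratio selection would be refitted inside each cross-validation fold (e.g. as a pipeline step) rather than on the full data as shown here.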

Bibliographic Details
Main Authors: Ali Harimi, Ali Shahzadi, Alireza Ahmadyfard, Khashayar Yaghmaie
Format: Article
Language: English
Published: Shahrood University of Technology, 2014-06-01
Series: Journal of Artificial Intelligence and Data Mining
ISSN: 2322-5211, 2322-4444
DOI: 10.22044/jadm.2014.150
Author Affiliations: Ali Harimi, Ali Shahzadi, and Khashayar Yaghmaie (Faculty of Electrical & Computer Engineering, Semnan University); Alireza Ahmadyfard (Department of Electrical Engineering and Robotics, Shahrood University of Technology)
Subjects: Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation
Online Access: http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf