Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmo...
Main Authors: | Ali Harimi, Ali Shahzadi, Alireza Ahmadyfard, Khashayar Yaghmaie |
---|---|
Format: | Article |
Language: | English |
Published: | Shahrood University of Technology, 2014-06-01 |
Series: | Journal of Artificial Intelligence and Data Mining |
Subjects: | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
Online Access: | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf |
_version_ | 1819210560594509824 |
---|---|
author | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie |
author_facet | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie |
author_sort | Ali Harimi |
collection | DOAJ |
description | Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interaction. The aim of an SER system is to recognize human emotion by analyzing the acoustics of speech. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features are extracted from the spectrogram of the speech signal using image processing techniques. For this purpose, details in the spectrogram image are first highlighted using the histogram equalization technique. Then, directional filters are applied to decompose the image into six directional components. Finally, a binary masking approach is employed to extract SPs from the sub-banded images. The proposed HEs are extracted by applying band-pass filters to the spectrogram image. The dimensionality of the extracted features is reduced using a filter-type feature selection algorithm based on the Fisher discriminant ratio. The classification accuracy of the proposed SER system has been evaluated using 10-fold cross-validation on the Berlin database. Average recognition rates of 88.37% and 85.04% were achieved for female and male speakers, respectively. Considering all male and female samples together, an overall recognition rate of 86.91% was obtained. |
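The description above outlines the feature-extraction pipeline only at a high level. The sketch below shows how such a spectrogram-image pipeline could look in Python; the window length, the Gabor filter bank standing in for the unspecified directional filters, the Otsu threshold used for binary masking, the eight harmonic sub-bands, and the particular Fisher-ratio formulation are all assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch only. The record does not give the authors' filter design,
# thresholds or parameter values, so every concrete choice below is an
# assumption made for demonstration purposes.
import numpy as np
from scipy.signal import spectrogram
from skimage import exposure, filters


def spectral_pattern_features(signal, fs, n_orientations=6):
    """Extract SP- and HE-style features from one utterance, per the description."""
    # 1) Spectrogram of the speech signal, treated as an image.
    _, _, sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=384)
    img = np.log1p(sxx)

    # 2) Histogram equalization to highlight details in the spectrogram image.
    img = exposure.equalize_hist(img)

    sps = []
    for k in range(n_orientations):
        # 3) Directional decomposition into six components; a Gabor filter bank
        #    stands in for the unspecified directional filters.
        theta = k * np.pi / n_orientations
        response, _ = filters.gabor(img, frequency=0.25, theta=theta)

        # 4) Binary masking of each directional component (Otsu threshold here);
        #    the density of active pattern pixels serves as one SP feature.
        mask = response > filters.threshold_otsu(response)
        sps.append(mask.mean())

    # 5) Harmonic Energy features: relative energy of frequency sub-bands,
    #    emulating band-pass filtering of the spectrogram image.
    bands = np.array_split(sxx, 8, axis=0)
    hes = [band.sum() / sxx.sum() for band in bands]

    return np.array(sps + hes)


def fisher_ratio_selection(X, y, n_keep):
    """Filter-type selection: rank features by a simple Fisher discriminant ratio."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    between = sum((X[y == c].mean(axis=0) - mu) ** 2 for c in classes)
    within = sum(X[y == c].var(axis=0) for c in classes) + 1e-12
    return np.argsort(between / within)[::-1][:n_keep]
```

With features extracted per utterance and ranked this way, the reported evaluation protocol (10-fold cross-validation on the Berlin database) could be reproduced with any standard classifier; the record does not specify which classifier the authors used.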
first_indexed | 2024-12-23T06:13:07Z |
format | Article |
id | doaj.art-01edf57fe3b143d7a97b60f8beeec371 |
institution | Directory Open Access Journal |
issn | 2322-5211 2322-4444 |
language | English |
last_indexed | 2024-12-23T06:13:07Z |
publishDate | 2014-06-01 |
publisher | Shahrood University of Technology |
record_format | Article |
series | Journal of Artificial Intelligence and Data Mining |
spelling | doaj.art-01edf57fe3b143d7a97b60f8beeec371 | 2022-12-21T17:57:24Z | eng | Shahrood University of Technology | Journal of Artificial Intelligence and Data Mining | 2322-5211; 2322-4444 | 2014-06-01 | vol. 2, no. 1, pp. 53-61 | 10.22044/jadm.2014.150 | 150 | Classification of emotional speech using spectral pattern features | Ali Harimi (Faculty of Electrical & Computer Engineering, Semnan University); Ali Shahzadi (Faculty of Electrical & Computer Engineering, Semnan University); Alireza Ahmadyfard (Department of Electrical Engineering and Robotics, Shahrood University of Technology); Khashayar Yaghmaie (Faculty of Electrical & Computer Engineering, Semnan University) | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
spellingShingle | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie; Classification of emotional speech using spectral pattern features; Journal of Artificial Intelligence and Data Mining; Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
title | Classification of emotional speech using spectral pattern features |
title_full | Classification of emotional speech using spectral pattern features |
title_fullStr | Classification of emotional speech using spectral pattern features |
title_full_unstemmed | Classification of emotional speech using spectral pattern features |
title_short | Classification of emotional speech using spectral pattern features |
title_sort | classification of emotional speech using spectral pattern features |
topic | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
url | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf |
work_keys_str_mv | AT aliharimi classificationofemotionalspeechusingspectralpatternfeatures AT alishahzadi classificationofemotionalspeechusingspectralpatternfeatures AT alirezaahmadyfard classificationofemotionalspeechusingspectralpatternfeatures AT khashayaryaghmaie classificationofemotionalspeechusingspectralpatternfeatures |