Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmo...
Main Authors: | Ali Harimi, Ali Shahzadi, Alireza Ahmadyfard, Khashayar Yaghmaie |
---|---|
Format: | Article |
Language: | English |
Published: | Shahrood University of Technology, 2014-06-01 |
Series: | Journal of Artificial Intelligence and Data Mining |
Subjects: | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
Online Access: | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf |
_version_ | 1819210560594509824 |
---|---|
author | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie |
author_facet | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie |
author_sort | Ali Harimi |
collection | DOAJ |
description | Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interaction. The aim of an SER system is to recognize human emotion by analyzing the acoustics of speech. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features are extracted from the spectrogram of the speech signal using image processing techniques. For this purpose, details in the spectrogram image are first highlighted using the histogram equalization technique. Then, directional filters are applied to decompose the image into six directional components. Finally, a binary masking approach is employed to extract SPs from the sub-banded images. The proposed HEs are extracted by applying band-pass filters to the spectrogram image. The dimensionality of the extracted features is reduced using a filter-type feature selection algorithm based on the Fisher discriminant ratio. The classification accuracy of the proposed SER system has been evaluated using 10-fold cross-validation on the Berlin database. Average recognition rates of 88.37% and 85.04% were achieved for female and male speakers, respectively. Considering all male and female samples together, an overall recognition rate of 86.91% was obtained. |
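The description above outlines the feature-extraction pipeline only at a high level. The sketch below shows how such a spectrogram-image pipeline could look in Python; the window length, the Gabor filter bank standing in for the unspecified directional filters, the Otsu threshold used for binary masking, the eight harmonic sub-bands, and the particular Fisher-ratio formulation are all assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch only. The record does not give the authors' filter design,
# thresholds or parameter values, so every concrete choice below is an
# assumption made for demonstration purposes.
import numpy as np
from scipy.signal import spectrogram
from skimage import exposure, filters


def spectral_pattern_features(signal, fs, n_orientations=6):
    """Extract SP- and HE-style features from one utterance, per the description."""
    # 1) Spectrogram of the speech signal, treated as an image.
    _, _, sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=384)
    img = np.log1p(sxx)

    # 2) Histogram equalization to highlight details in the spectrogram image.
    img = exposure.equalize_hist(img)

    sps = []
    for k in range(n_orientations):
        # 3) Directional decomposition into six components; a Gabor filter bank
        #    stands in for the unspecified directional filters.
        theta = k * np.pi / n_orientations
        response, _ = filters.gabor(img, frequency=0.25, theta=theta)

        # 4) Binary masking of each directional component (Otsu threshold here);
        #    the density of active pattern pixels serves as one SP feature.
        mask = response > filters.threshold_otsu(response)
        sps.append(mask.mean())

    # 5) Harmonic Energy features: relative energy of frequency sub-bands,
    #    emulating band-pass filtering of the spectrogram image.
    bands = np.array_split(sxx, 8, axis=0)
    hes = [band.sum() / sxx.sum() for band in bands]

    return np.array(sps + hes)


def fisher_ratio_selection(X, y, n_keep):
    """Filter-type selection: rank features by a simple Fisher discriminant ratio."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    between = sum((X[y == c].mean(axis=0) - mu) ** 2 for c in classes)
    within = sum(X[y == c].var(axis=0) for c in classes) + 1e-12
    return np.argsort(between / within)[::-1][:n_keep]
```

With features extracted per utterance and ranked this way, the reported evaluation protocol (10-fold cross-validation on the Berlin database) could be reproduced with any standard classifier; the record does not specify which classifier the authors used.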
first_indexed | 2024-12-23T06:13:07Z |
format | Article |
id | doaj.art-01edf57fe3b143d7a97b60f8beeec371 |
institution | Directory Open Access Journal |
issn | 2322-5211 2322-4444 |
language | English |
last_indexed | 2024-12-23T06:13:07Z |
publishDate | 2014-06-01 |
publisher | Shahrood University of Technology |
record_format | Article |
series | Journal of Artificial Intelligence and Data Mining |
spelling | doaj.art-01edf57fe3b143d7a97b60f8beeec371 | 2022-12-21T17:57:24Z | eng | Shahrood University of Technology | Journal of Artificial Intelligence and Data Mining | 2322-5211; 2322-4444 | 2014-06-01 | vol. 2, no. 1, pp. 53-61 | 10.22044/jadm.2014.150 | 150 | Classification of emotional speech using spectral pattern features | Ali Harimi (Faculty of Electrical & Computer Engineering, Semnan University); Ali Shahzadi (Faculty of Electrical & Computer Engineering, Semnan University); Alireza Ahmadyfard (Department of Electrical Engineering and Robotics, Shahrood University of Technology); Khashayar Yaghmaie (Faculty of Electrical & Computer Engineering, Semnan University) | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
spellingShingle | Ali Harimi; Ali Shahzadi; Alireza Ahmadyfard; Khashayar Yaghmaie; Classification of emotional speech using spectral pattern features; Journal of Artificial Intelligence and Data Mining; Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
title | Classification of emotional speech using spectral pattern features |
title_full | Classification of emotional speech using spectral pattern features |
title_fullStr | Classification of emotional speech using spectral pattern features |
title_full_unstemmed | Classification of emotional speech using spectral pattern features |
title_short | Classification of emotional speech using spectral pattern features |
title_sort | classification of emotional speech using spectral pattern features |
topic | Speech emotion recognition; spectral pattern features; harmonic energy features; cross validation |
url | http://jad.shahroodut.ac.ir/article_150_91f7a51f3a7917443555dfe0b2992b62.pdf |
work_keys_str_mv | AT aliharimi classificationofemotionalspeechusingspectralpatternfeatures AT alishahzadi classificationofemotionalspeechusingspectralpatternfeatures AT alirezaahmadyfard classificationofemotionalspeechusingspectralpatternfeatures AT khashayaryaghmaie classificationofemotionalspeechusingspectralpatternfeatures |