Deep Visual Attributes vs. Hand-Crafted Audio Features on Multidomain Speech Emotion Recognition

Emotion recognition from speech may play a crucial role in many applications related to human–computer interaction or understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human’s emotions may be r...

Bibliographic Details
Main Authors:	Michalis Papakostas, Evaggelos Spyrou, Theodoros Giannakopoulos, Giorgos Siantikos, Dimitrios Sgouropoulos, Phivos Mylonas, Fillia Makedon
Format:	Article
Language:	English
Published:	MDPI AG 2017-06-01
Series:	Computation
Subjects:	emotion recognition; convolutional neural networks; spectrograms
Online Access:	http://www.mdpi.com/2079-3197/5/2/26

Similar Items

Emotion Recognition from Speech Using the Bag-of-Visual Words on Audio Segment Spectrograms
by: Evaggelos Spyrou, et al.
Published: (2019-02-01)

Introduction to the Special Issue on Image-Based Information Retrieval from the Web
by: Phivos Mylonas, et al.
Published: (2019-06-01)

Human Activity Recognition in the Presence of Occlusion
by: Ioannis Vernikos, et al.
Published: (2023-05-01)

CogBeacon: A Multi-Modal Dataset and Data-Collection Platform for Modeling Cognitive Fatigue
by: Michalis Papakostas, et al.
Published: (2019-06-01)

Audio-Based Event Detection at Different SNR Settings Using Two-Dimensional Spectrogram Magnitude Representations
by: Ioannis Papadimitriou, et al.
Published: (2020-09-01)

Speech Emotion Recognition Using a Dual-Channel Complementary Spectrogram and the CNN-SSAE Neutral Network
by: Juan Li, et al.
Published: (2022-09-01)

Penetration State Identification of Aluminum Alloy Cold Metal Transfer Based on Arc Sound Signals Using Multi-Spectrogram Fusion Inception Convolutional Neural Network
by: Guang Yang, et al.
Published: (2023-12-01)

Mel-MViTv2: Enhanced Speech Emotion Recognition With Mel Spectrogram and Improved Multiscale Vision Transformers
by: Kah Liang Ong, et al.
Published: (2023-01-01)

A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition
by: Mustaqeem, et al.
Published: (2019-12-01)

NISQE: Non-Intrusive Speech Quality Evaluator Based on Natural Statistics of Mean Subtracted Contrast Normalized Coefficients of Spectrogram
by: Shakeel Zafar, et al.
Published: (2023-06-01)

Disrupting Audio Event Detection Deep Neural Networks with White Noise
by: Rodrigo dos Santos, et al.
Published: (2021-09-01)

Features Speech Signature Image Recognition on Mobile Devices
by: Alexander Mikhailovich Alyushin, et al.
Published: (2015-12-01)

The Use of Speech Technology to Protect the Document Turnover
by: Alexandr M. Alyushin, et al.
Published: (2017-06-01)

An Introduction to Speech Sciences (Acoustic Analysis of Speech)
by: Nasser Rezaei, et al.
Published: (2006-09-01)

A Survey of Audio Enhancement Algorithms for Music, Speech, Bioacoustics, Biomedical, Industrial, and Environmental Sounds by Image U-Net
by: Sania Gul, et al.
Published: (2023-01-01)

A Social Environmental Sensor Network Integrated within a Web GIS Platform
by: Yorghos Voutos, et al.
Published: (2017-11-01)

Audio Time Stretching Using Fuzzy Classification of Spectral Bins
by: Eero-Pekka Damskägg, et al.
Published: (2017-12-01)

Predominant audio source separation in polyphonic music
by: Lekshmi Chandrika Reghunath, et al.
Published: (2023-11-01)

Semantic multimedia analysis and processing /
by: Spyrou, Evaggelos, editor, et al.
Published: (2014)

An AI-Enabled Bias-Free Respiratory Disease Diagnosis Model Using Cough Audio
by: Tabish Saeed, et al.
Published: (2024-01-01)

Automatic recognition and representation of text in the form of audio stream
by: L. V. Serebryanaya, et al.
Published: (2021-10-01)

On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition
by: Juraj Kacur, et al.
Published: (2021-03-01)

Masked Spectrogram Prediction for Unsupervised Domain Adaptation in Speech Enhancement
by: Katerina Zmolikova, et al.
Published: (2024-01-01)

Speech Emotion Recognition Based on Deep Residual Shrinkage Network
by: Tian Han, et al.
Published: (2023-06-01)

Language Identification-Based Evaluation of Single Channel Speech Separation of Overlapped Speeches
by: Zuhragvl Aysa, et al.
Published: (2022-10-01)

Data Augmentation vs. Domain Adaptation—A Case Study in Human Activity Recognition
by: Evaggelos Spyrou, et al.
Published: (2020-10-01)

An Investigation of ECAPA-TDNN Audio Type Recognition Method Based on Mel Acoustic Spectrograms
by: Jian Wang, et al.
Published: (2023-10-01)

nnAudio: An on-the-Fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolutional Neural Networks
by: Kin Wai Cheuk, et al.
Published: (2020-01-01)

A Dual Stream Generative Adversarial Network with Phase Awareness for Speech Enhancement
by: Xintao Liang, et al.
Published: (2023-04-01)

Cyclicality in lending activity of Euro area in pre- and post- 2008 crisis: a local-adaptive-based testing of wavelets
by: Jitka Poměnková, et al.
Published: (2019-01-01)

Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks
by: Roneel V. Sharan, et al.
Published: (2021-05-01)

Classification of audio signals using spectrogram surfaces and extrinsic distortion measures
by: Jeremy Levy, et al.
Published: (2022-10-01)

Speech Communication
by: Perkell, Joseph S., et al.
Published: (2010)

Speech Recognition of Accented Mandarin Based on Improved Conformer
by: Xing-Yao Yang, et al.
Published: (2023-04-01)

On the Effect of Log-Mel Spectrogram Parameter Tuning for Deep Learning-Based Speech Emotion Recognition
by: Azamat Mukhamediya, et al.
Published: (2023-01-01)

A New Network Structure for Speech Emotion Recognition Research
by: Chunsheng Xu, et al.
Published: (2024-02-01)

Enhancing Speech Emotion Recognition Using Dual Feature Extraction Encoders
by: Ilkhomjon Pulatov, et al.
Published: (2023-07-01)

The Feature Extraction Based on Texture Image Information for Emotion Sensing in Speech
by: Kun-Ching Wang
Published: (2014-09-01)

High gamma cortical processing of continuous speech in younger and older listeners
by: Joshua P. Kulasingham, et al.
Published: (2020-11-01)

Phonetic evaluation of the edentulous patient correlated with the various settings of the artificial teeth
by: Bortun Cristina, et al.
Published: (2004-01-01)