ACG-EmoCluster: A Novel Framework to Capture Spatial and Temporal Information from Emotional Speech Enhanced by DeepCluster

Speech emotion recognition (SER) is a task that tailors a matching function between the speech features and the emotion labels. Speech data have higher information saturation than images and stronger temporal coherence than text. This makes entirely and effectively learning speech features challengi...

Bibliographic Details
Main Authors:	Huan Zhao, Lixuan Li, Xupeng Zha, Yujiang Wang, Zhaoxin Xie, Zixing Zhang
Format:	Article
Language:	English
Published:	MDPI AG 2023-05-01
Series:	Sensors
Subjects:	Attn–Convolution neural network; Bidirectional Gated Recurrent Unit (BiGRU); semi-supervised learning (SSL); speech emotion recognition (SER)
Online Access:	https://www.mdpi.com/1424-8220/23/10/4777

Similar Items