Semi-Supervised Deep Time-Delay Embedded Clustering for Stress Speech Analysis

Bibliographic Details
Main Authors: Barlian Henryranu Prasetio, Hiroki Tamura, Koichi Tanno
Format: Article
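Language: English
Published: MDPI AG 2019-11-01
Series: Electronics
Subjects: semi-supervised; clustering; stress speech; deep clustering; dnn; tdnn; prior knowledge; pairwise constraints
Online Access: https://www.mdpi.com/2079-9292/8/11/1263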
author Barlian Henryranu Prasetio
Hiroki Tamura
Koichi Tanno
collection DOAJ
description Real stressed speech is affected by many factors (individual characteristics and environment), so stress patterns are diverse and differ between individuals. To address this, in our previous work we proposed an unsupervised clustering method, deep time-delay embedded clustering (DTEC), which learns the feature representations of stressed speech and the clustering assignments simultaneously in a self-learning manner. However, the compatibility between DTEC's output classes and the informational classes has not yet been confirmed. Therefore, we propose semi-supervised deep time-delay embedded clustering (SDTEC), a new semi-supervised framework built on DTEC. SDTEC incorporates prior information, in the form of pairwise constraints, into the embedding layer and simultaneously learns the feature representation and the clustering assignments. The prior information guides the clustering procedure so that points assigned to an incorrect cluster can be corrected. The effectiveness of SDTEC was evaluated by comparing it with several baseline methods in terms of the clustering error rate (CER). Moreover, to demonstrate SDTEC's capabilities, we conducted a comprehensive ablation study. In the experiments, SDTEC outperformed the baseline methods and achieved state-of-the-art results in semi-supervised clustering.
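The two quantities named in the description, pairwise constraints and the clustering error rate, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the paper's exact CER definition, so the code assumes the common one (the fraction of points misassigned under the best one-to-one mapping of clusters to classes), and the function names `clustering_error_rate` and `violated_constraints` are hypothetical.

```python
from itertools import permutations


def clustering_error_rate(true_labels, cluster_ids):
    """Assumed CER: 1 - accuracy under the best one-to-one cluster-to-class mapping.

    Brute-force search over mappings; only suitable for a handful of clusters.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than classes: no one-to-one mapping exists")
    best_correct = 0
    # Try every injective assignment of clusters to classes.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best_correct = max(best_correct, correct)
    return 1.0 - best_correct / len(true_labels)


def violated_constraints(cluster_ids, must_link, cannot_link):
    """Return the must-link and cannot-link pairs that an assignment breaks.

    A must-link pair (i, j) is violated when i and j land in different clusters;
    a cannot-link pair is violated when they land in the same cluster.
    """
    broken_ml = [(i, j) for i, j in must_link if cluster_ids[i] != cluster_ids[j]]
    broken_cl = [(i, j) for i, j in cannot_link if cluster_ids[i] == cluster_ids[j]]
    return broken_ml, broken_cl


if __name__ == "__main__":
    true_labels = [0, 0, 1, 1, 2, 2]   # hypothetical ground-truth classes
    cluster_ids = [1, 1, 0, 0, 2, 0]   # hypothetical cluster assignments
    print(clustering_error_rate(true_labels, cluster_ids))
    print(violated_constraints(cluster_ids, must_link=[(4, 5)], cannot_link=[(2, 5)]))
```

In a semi-supervised setting such as the one described, violated pairs like these are the signal that would steer a misassigned point toward the correct cluster; how SDTEC turns them into a training objective is not stated in this record.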
first_indexed 2024-04-11T20:45:54Z
format Article
id doaj.art-0714a7fd2fee4ef380c3f8234da4cdc6
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-04-11T20:45:54Z
publishDate 2019-11-01
publisher MDPI AG
record_format Article
series Electronics
doi 10.3390/electronics8111263
volume 8
issue 11
article_number 1263
record_timestamp 2022-12-22T04:04:02Z
author_affiliation Barlian Henryranu Prasetio: Interdisciplinary Graduate School of Agriculture and Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
Hiroki Tamura: Faculty of Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
Koichi Tanno: Faculty of Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
title Semi-Supervised Deep Time-Delay Embedded Clustering for Stress Speech Analysis
topic semi-supervised
clustering
stress speech
deep clustering
dnn
tdnn
prior knowledge
pairwise constraints
url https://www.mdpi.com/2079-9292/8/11/1263