Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition

Speech feature learning is the key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small sample problem. The traditional feature extraction method is effective, but cannot find the inter-feature structure to generate the new...

Bibliographic Details
Main Authors: Hong Chen, Yuan Lin, Yongming Li, Wei Wang, Pin Wang, Yan Lei
Format: Article
Language: English
Published: IEEE 2021-01-01
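Series: IEEE Access
Subjects: Embedded hybrid feature sparse stacked autoencoder; ensemble learning; feature fusion; L1 regularization; speech mental health recognition
Online Access: https://ieeexplore.ieee.org/document/9348883/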
author Hong Chen
Yuan Lin
Yongming Li
Wei Wang
Pin Wang
Yan Lei
collection DOAJ
description Speech feature learning is the key to speech-based mental health recognition. Deep feature learning can extract speech features automatically but suffers from the small sample problem. Traditional feature extraction is effective but cannot find the inter-feature structure needed to generate new high-quality features. This paper proposes an embedded hybrid feature deep sparse stacked autoencoder ensemble method to solve this problem. First, speech features are extracted based on prior knowledge; these are called original features. Second, the original features are embedded into the deep network (a sparse stacked autoencoder) to filter the output of the hidden layer, enhancing the complementarity between the deep features and the original features. Third, an L1-regularized feature selection mechanism is designed to reduce the hybrid feature set formed by combining the deep features and the original features. Finally, a manifold projection classifier ensemble is designed to enhance the stability of classification. In addition, this paper proposes, for the first time, a speech collection scheme for mental health recognition. We construct a large-scale Chinese mental health speech database to verify the proposed algorithm. In the experimental section, the proposed algorithm is verified and compared with representative related algorithms. The experimental results show that the proposed algorithm achieves better classification accuracy than the other representative algorithms. The proposed method combines the advantages of deep feature learning and traditional feature extraction more efficiently to solve the small sample problem.
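The L1-regularized selection step named in the description can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain Lasso objective, (1/2n)·||y − Xw||² + λ·||w||₁, solved by coordinate descent, and the toy "original" and "deep" feature values, the targets, and all function names are hypothetical. Features whose weight is driven exactly to zero are dropped from the hybrid set.

```python
# Illustrative sketch only (hypothetical data and names, not the paper's code):
# Lasso by coordinate descent, used to prune a hybrid feature set.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the proximal step for L1)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=50):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 over the weight vector w."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Assumed toy inputs: two hand-crafted "original" features and two "deep"
# features (stand-ins for autoencoder hidden-layer outputs) per sample.
original = [[1, 1], [1, -1], [1, 1], [1, -1]]
deep = [[1, 1], [1, -1], [-1, -1], [-1, 1]]

# Hybrid feature set: concatenate original and deep features sample by sample.
hybrid = [o + d for o, d in zip(original, deep)]

# Hypothetical targets that depend mainly on hybrid features 0 and 2.
targets = [0.6, 0.4, 3.4, 3.6]

weights = lasso_coordinate_descent(hybrid, targets, lam=0.3)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print("weights:", weights)
print("selected feature indices:", selected)
```

With these toy values the penalty zeroes out the weights of features 1 and 3, so one original feature and one deep feature survive. A larger λ prunes more aggressively; how the paper sets its regularization strength is not stated in this record.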
first_indexed 2024-12-22T20:51:22Z
format Article
id doaj.art-4e1029e37cad4c62884db95a4caa020b
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T20:51:22Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-4e1029e37cad4c62884db95a4caa020b
Timestamp: 2022-12-21T18:13:04Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Published: 2021-01-01
Volume: 9
Pages: 28729-28741
DOI: 10.1109/ACCESS.2021.3057382
IEEE document number: 9348883
Title: Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition
Authors (with affiliation index): Hong Chen [0], Yuan Lin [1], Yongming Li [2], Wei Wang [3], Pin Wang [4], Yan Lei [5]
ORCID iDs (in record order): https://orcid.org/0000-0003-1564-5613, https://orcid.org/0000-0002-7542-4356, https://orcid.org/0000-0002-4214-0488
Affiliations:
[0] Chongqing University Cancer Hospital, Chongqing, China
[1] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
[2] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
[3] Chongqing University Cancer Hospital, Chongqing, China
[4] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
[5] School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
title Hybrid Feature Embedded Sparse Stacked Autoencoder and Manifold Dimensionality Reduction Ensemble for Mental Health Speech Recognition
topic Embedded hybrid feature sparse stacked autoencoder
ensemble learning
feature fusion
L1 regularization
speech mental health recognition
url https://ieeexplore.ieee.org/document/9348883/