Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network

Finer-grained decoding at a phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims at developing a novel syllable-level decoding method for continuous silent speech recognition (SSR) using spatio-temporal end-...

Bibliographic Details
Main Authors:	Xi Chen, Xu Zhang, Xiang Chen, Xun Chen
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Transactions on Neural Systems and Rehabilitation Engineering
Subjects:	Silent speech recognition; high-density surface electromyography; spatiotemporal feature; language model; time sequence decoding
Online Access:	https://ieeexplore.ieee.org/document/10098814/

_version_	1797805166284505088
author	Xi Chen, Xu Zhang, Xiang Chen, Xun Chen
author_facet	Xi Chen, Xu Zhang, Xiang Chen, Xun Chen
author_sort	Xi Chen
collection	DOAJ
description	Finer-grained decoding at the phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper develops a novel syllable-level decoding method for continuous silent speech recognition (SSR) using a spatio-temporal end-to-end neural network. In the proposed method, the high-density sEMG (HD-sEMG) was first converted into a series of feature images, and then a spatio-temporal end-to-end neural network was applied to extract discriminative feature representations and to achieve syllable-level decoding. The effectiveness of the proposed method was verified with HD-sEMG data recorded by four 64-channel electrode arrays placed over the facial and laryngeal muscles of fifteen subjects subvocalizing 33 Chinese phrases consisting of 82 syllables. The proposed method outperformed the benchmark methods, achieving the highest phrase classification accuracy (97.17 ± 1.53%, $p < 0.05$) and a lower character error rate (3.11 ± 1.46%, $p < 0.05$). This study provides a promising way of decoding sEMG for SSR, with great potential for applications in instant communication and remote control.
first_indexed	2024-03-13T05:47:55Z
format	Article
id	doaj.art-8c5342e327894227a2c94f9d6f40f17c
institution	Directory of Open Access Journals
issn	1558-0210
language	English
last_indexed	2024-03-13T05:47:55Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Transactions on Neural Systems and Rehabilitation Engineering
spelling	doaj.art-8c5342e327894227a2c94f9d6f40f17c | 2023-06-13T20:07:37Z | eng | IEEE | IEEE Transactions on Neural Systems and Rehabilitation Engineering | 1558-0210 | 2023-01-01 | 31 | 2069 | 2078 | 10.1109/TNSRE.2023.3266299 | 10098814 | Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network | Xi Chen (0), https://orcid.org/0000-0001-7889-7151 | Xu Zhang (1), https://orcid.org/0000-0002-1533-4340 | Xiang Chen (2), https://orcid.org/0000-0001-8259-4815 | Xun Chen (3), https://orcid.org/0000-0002-4922-8116 | School of Information Science and Technology, University of Science and Technology of China, Anhui, Hefei, China | School of Information Science and Technology, University of Science and Technology of China, Anhui, Hefei, China | School of Information Science and Technology, University of Science and Technology of China, Anhui, Hefei, China | School of Information Science and Technology, University of Science and Technology of China, Anhui, Hefei, China | Finer-grained decoding at a phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims at developing a novel syllable-level decoding method for continuous silent speech recognition (SSR) using spatio-temporal end-to-end neural network. In the proposed method, the high-density sEMG (HD-sEMG) was first converted into a series of feature images, and then a spatio-temporal end-to-end neural network was applied to extract discriminative feature representations and to achieve syllable-level decoding. The effectiveness of the proposed method was verified with HD-sEMG data recorded by four pieces of 64-channel electrode arrays placed over facial and laryngeal muscles of fifteen subjects subvocalizing 33 Chinese phrases consisting of 82 syllables. The proposed method outperformed the benchmark methods by achieving the highest phrase classification accuracy (97.17 ± 1.53%, $p < 0.05$), and lower character error rate (3.11 ± 1.46%, $p < 0.05$). This study provides a promising way of decoding sEMG towards SSR, which has great potential applications in instant communication and remote control. | https://ieeexplore.ieee.org/document/10098814/ | Silent speech recognition | high-density surface electromyography | spatiotemporal feature | language model | time sequence decoding
title	Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network
topic	Silent speech recognition; high-density surface electromyography; spatiotemporal feature; language model; time sequence decoding
url	https://ieeexplore.ieee.org/document/10098814/