Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network
Finer-grained decoding at a phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims at developing a novel syllable-level decoding method for continuous silent speech recognition (SSR) using spatio-temporal end-...
Main Authors: | Xi Chen, Xu Zhang, Xiang Chen, Xun Chen |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Transactions on Neural Systems and Rehabilitation Engineering |
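Subjects: | Silent speech recognition; high-density surface electromyography; spatiotemporal feature; language model; time sequence decoding |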
Online Access: | https://ieeexplore.ieee.org/document/10098814/ |
_version_ | 1797805166284505088 |
---|---|
author | Xi Chen; Xu Zhang; Xiang Chen; Xun Chen |
author_facet | Xi Chen; Xu Zhang; Xiang Chen; Xun Chen |
author_sort | Xi Chen |
collection | DOAJ |
description | Finer-grained decoding at the phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims to develop a novel syllable-level decoding method for continuous silent speech recognition (SSR) using a spatio-temporal end-to-end neural network. In the proposed method, the high-density sEMG (HD-sEMG) was first converted into a series of feature images, and then a spatio-temporal end-to-end neural network was applied to extract discriminative feature representations and to achieve syllable-level decoding. The effectiveness of the proposed method was verified with HD-sEMG data recorded by four 64-channel electrode arrays placed over the facial and laryngeal muscles of fifteen subjects subvocalizing 33 Chinese phrases consisting of 82 syllables. The proposed method outperformed the benchmark methods by achieving the highest phrase classification accuracy (97.17 ± 1.53%, $p < 0.05$) and a lower character error rate (3.11 ± 1.46%, $p < 0.05$). This study provides a promising way of decoding sEMG towards SSR, which has great potential applications in instant communication and remote control. |
first_indexed | 2024-03-13T05:47:55Z |
format | Article |
id | doaj.art-8c5342e327894227a2c94f9d6f40f17c |
institution | Directory Open Access Journal |
issn | 1558-0210 |
language | English |
last_indexed | 2024-03-13T05:47:55Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Transactions on Neural Systems and Rehabilitation Engineering |
spelling | doaj.art-8c5342e327894227a2c94f9d6f40f17c; 2023-06-13T20:07:37Z; eng; IEEE; IEEE Transactions on Neural Systems and Rehabilitation Engineering; 1558-0210; 2023-01-01; 31; 2069; 2078; 10.1109/TNSRE.2023.3266299; 10098814; Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network; Xi Chen (https://orcid.org/0000-0001-7889-7151); Xu Zhang (https://orcid.org/0000-0002-1533-4340); Xiang Chen (https://orcid.org/0000-0001-8259-4815); Xun Chen (https://orcid.org/0000-0002-4922-8116); School of Information Science and Technology, University of Science and Technology of China, Anhui, Hefei, China; https://ieeexplore.ieee.org/document/10098814/; Silent speech recognition; high-density surface electromyography; spatiotemporal feature; language model; time sequence decoding |
title | Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network |
title_sort | decoding silent speech based on high density surface electromyogram using spatiotemporal neural network |
topic | Silent speech recognition; high-density surface electromyography; spatiotemporal feature; language model; time sequence decoding |
url | https://ieeexplore.ieee.org/document/10098814/ |
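
The character error rate quoted in the description is conventionally computed as the Levenshtein (edit) distance between the decoded and the reference character sequences, divided by the total reference length. This record does not state the authors' exact evaluation procedure, so the following is only a minimal Python sketch under that assumption; the function names `edit_distance` and `character_error_rate` are hypothetical and not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using two DP rows."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance over all pairs, divided by total reference length.

    Assumption: errors are pooled over the whole set rather than averaged
    per phrase; the record does not say which convention the paper uses.
    """
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_errors = sum(edit_distance(r, h)
                       for r, h in zip(references, hypotheses))
    return total_errors / total_chars
```

For syllable-level output, each phrase could be passed as a list of syllable tokens instead of a string, since both functions only rely on sequence length and element equality.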