Speech recognition of south China languages based on federated learning and mathematical construction
As speech recognition technology grows more sophisticated and computing power increases, recognition technologies are being integrated into more and more software platforms, enabling intelligent speech processing. We create a comprehensive processing platform for multiling...
Main Authors: | Weiwei Lai, Yinglong Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2023-07-01 |
Series: | Electronic Research Archive |
Subjects: | federated learning; south china; language speech recognition; mathematical model |
Online Access: | https://www.aimspress.com/article/doi/10.3934/era.2023255?viewType=HTML |
_version_ | 1797690942396825600 |
---|---|
author | Weiwei Lai; Yinglong Zheng |
author_sort | Weiwei Lai |
collection | DOAJ |
description | As speech recognition technology grows more sophisticated and computing power increases, recognition technologies are being integrated into more and more software platforms, enabling intelligent speech processing. We create a comprehensive processing platform for multilingual resources used in business and security, based on speech recognition and distributed processing technology. Using a federated learning model, this study develops speech recognition, together with its mathematical model, for the languages of South China. It also creates a speech dataset for South China dialects, which at present includes Mandarin and three dialects (Cantonese, Chaoshan and Hakka) that are widely spoken in the Guangdong region. Additionally, to address the unequal label distribution in the dataset, it applies two data enhancement techniques suited to speech signal characteristics: audio enhancement and spectrogram enhancement. Experimental results, compared with earlier work in the field, show a macro-average F-value of 91.54% for the proposed dialect classification model, which is based on hybrid domain attention and combines this structure with a hyperbolic tangent activation function and spatial domain attention. |
first_indexed | 2024-03-12T02:07:22Z |
format | Article |
id | doaj.art-9a2da207abcb491ca447d293f5eb8ddf |
institution | Directory of Open Access Journals |
issn | 2688-1594 |
language | English |
last_indexed | 2024-03-12T02:07:22Z |
publishDate | 2023-07-01 |
publisher | AIMS Press |
record_format | Article |
series | Electronic Research Archive |
spelling | doaj.art-9a2da207abcb491ca447d293f5eb8ddf; 2023-09-07T03:29:06Z; eng; AIMS Press; Electronic Research Archive; 2688-1594; 2023-07-01; 31(8): 4985-5005; 10.3934/era.2023255; Speech recognition of south China languages based on federated learning and mathematical construction; Weiwei Lai (1, 2); Yinglong Zheng (1, 3); 1. China Southern Power Grid Digital Enterprise Technology (Guangdong) Co., Ltd, Guangzhou 510000, Guangdong, China; 2. Northwestern Polytechnical University, Xi'an, Shaanxi Province, China; 3. South China University of Technology, Guangzhou, Guangdong Province, China; https://www.aimspress.com/article/doi/10.3934/era.2023255?viewType=HTML; federated learning; south china; language speech recognition; mathematical model |
title | Speech recognition of south China languages based on federated learning and mathematical construction |
topic | federated learning; south china; language speech recognition; mathematical model |
url | https://www.aimspress.com/article/doi/10.3934/era.2023255?viewType=HTML |