Construction of an Online Cloud Platform for Zhuang Speech Recognition and Translation with Edge-Computing-Based Deep Learning Algorithm
The Zhuang ethnic minority in China possesses its own ethnic language and no ethnic script. Cultural exchange and transmission encounter hurdles as the Zhuang rely exclusively on oral communication. An online cloud-based platform was required to enhance linguistic communication. First, a database of...
Main Authors: | Zeping Fan, Min Huang, Xuejun Zhang, Rongqi Liu, Xinyi Lyu, Taisen Duan, Zhaohui Bu, Jianghua Liang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Applied Sciences |
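Subjects: | automatic speech recognition; natural language processing; neural machine translation; transformer; cloud edge computing; network programming |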
Online Access: | https://www.mdpi.com/2076-3417/13/22/12184 |
_version_ | 1797460355675324416 |
---|---|
author | Zeping Fan Min Huang Xuejun Zhang Rongqi Liu Xinyi Lyu Taisen Duan Zhaohui Bu Jianghua Liang |
author_sort | Zeping Fan |
collection | DOAJ |
description | The Zhuang ethnic minority in China possesses its own ethnic language and no ethnic script. Cultural exchange and transmission encounter hurdles as the Zhuang rely exclusively on oral communication. An online cloud-based platform was required to enhance linguistic communication. First, a database of 200 h of annotated Zhuang speech was created by collecting standard Zhuang speeches and improving database quality by removing transcription inconsistencies and text normalization. Second, SAformerNet, a more efficient and accurate transformer-based automatic speech recognition (ASR) network, is achieved by inserting additional downsampling modules. Subsequently, a Neural Machine Translation (NMT) model for translating Zhuang into other languages is constructed by fine-tuning the BART model and corpus filtering strategy. Finally, for the network’s responsiveness to real-world needs, edge-computing techniques are applied to relieve network bandwidth pressure. An edge-computing private cloud system based on FPGA acceleration is proposed to improve model operation efficiency. Experiments show that the most critical metric of the system, model accuracy, is above 93%, and inference time is reduced by 29%. The computational delay for multi-head self-attention (MHSA) and feed-forward network (FFN) modules has been reduced by 7.1 and 1.9 times, respectively, and terminal response time is accelerated by 20% on average. Generally, the scheme provides a prototype tool for small-scale Zhuang remote natural language tasks in mountainous areas. |
first_indexed | 2024-03-09T17:03:52Z |
format | Article |
id | doaj.art-c6bb993d727d47d2a9e1ded0af3ce2e4 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T17:03:52Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-c6bb993d727d47d2a9e1ded0af3ce2e4; 2023-11-24T14:26:32Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-11-01; 13; 22; 12184; 10.3390/app132212184; Construction of an Online Cloud Platform for Zhuang Speech Recognition and Translation with Edge-Computing-Based Deep Learning Algorithm; Zeping Fan, Min Huang, Xuejun Zhang, Rongqi Liu, Xinyi Lyu, Taisen Duan (School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China); Zhaohui Bu (School of Foreign Language, Guangxi University, Nanning 530004, China); Jianghua Liang (School of Journalism and Communication, Guangxi University, Nanning 530004, China); https://www.mdpi.com/2076-3417/13/22/12184; automatic speech recognition; natural language processing; neural machine translation; transformer; cloud edge computing; network programming |
title | Construction of an Online Cloud Platform for Zhuang Speech Recognition and Translation with Edge-Computing-Based Deep Learning Algorithm |
topic | automatic speech recognition natural language processing neural machine translation transformer cloud edge computing network programming |
url | https://www.mdpi.com/2076-3417/13/22/12184 |