A RTL Implementation of Heterogeneous Machine Learning Network for French Computer Assisted Pronunciation Training

Bibliographic Details
Main Authors: Yanjing Bi, Chao Li, Yannick Benezeth, Fan Yang
Format: Article
Language: English
Published: MDPI AG 2023-05-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/13/10/5835
Description
Summary: Computer-assisted pronunciation training (CAPT) is a helpful method for self-directed or long-distance foreign language learning. It benefits greatly from progress in acoustic signal processing and artificial intelligence techniques. However, in real-life applications, embedded solutions are usually desired. This paper designs a register-transfer level (RTL) core to facilitate pronunciation diagnostic tasks by suppressing the multicollinearity of the speech waveforms. A recently proposed heterogeneous machine learning framework is selected as the French phoneme pronunciation diagnostic algorithm. This RTL core is implemented and optimized using a very-high-level synthesis method for fast prototyping. An original French phoneme data set containing 4830 samples is used for the evaluation experiments. The experimental results demonstrate that the proposed implementation reduces the diagnostic error rate by 0.79–1.33% compared to the state of the art and achieves a speedup of 10.89× relative to its CPU implementation at the same abstraction level of programming languages.
ISSN: 2076-3417