The Study of Malay’s Prosodic Features Impact on Classical Arabic Accents Recognition


Bibliographic Details
Main Authors: Noor Jamaliah Ibrahim, Mohd Yamani Idna Idris, M. Y. Zulkifli Mohd Yusoff, Roziana Ramli, Raja Jamilah Raja Yusof
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10196429/
Description
Summary: Modeling individual variation in speech patterns can be challenging in Automatic Speech Recognition (ASR). In the Classical Arabic (CA) language, 20 Quranic accents are permitted for Quranic recitation. An ASR system for CA with accent detection requires a modeling method that can capture changes in speech patterns. Here, we study the accentual influences on Malay speakers’ pronunciation and their prosodic impact on an ASR system for CA that identifies seven Quranic accents. The proposed ASR system was developed in three stages. First, a dataset of Surah Al-Fatihah recitations was recorded from 14 Malay speakers in seven Quranic accents, forming a total of 5,684 words. Second, various spectral and prosodic features were extracted from the dataset for classification. The final stage comprised training and testing the classification model. Existing ASR systems are often built on Gaussian Mixture Models (GMM) because of their capability to represent a wide range of sample distributions. However, a GMM is susceptible to overfitting when the model complexity is high, due to the presence of singularities. To support identification of seven Quranic accents, a Universal Background Model (UBM) is adapted to the GMM using the Maximum A Posteriori (MAP) estimation method. UBMs were trained on each of the Quranic accents and combined to establish a final UBM with 512 mixture components. The proposed ASR system using the GMM-UBM outperformed k-NN, GMM, and GMM-iVector in matching Al-Fatihah recitations to the corresponding Quranic accents. The GMM-UBM yields a testing accuracy of 86.148%, an increase of 4.435% over using GMM alone.
ISSN: 2169-3536
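
The MAP adaptation step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes diagonal covariances, adapts only the component means (a common GMM-UBM convention), and uses a relevance factor of 16, none of which is stated in the record. The function names (`map_adapt_means`, `llr_score`) and the toy data in the usage note are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def frame_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one frame under the mixture, plus component posteriors."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # log-sum-exp trick for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return top + math.log(total), [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means toward the given frames (assumed: means only)."""
    k, d = len(weights), len(means[0])
    counts = [0.0] * k                      # soft count n_k per component
    sums = [[0.0] * d for _ in range(k)]    # posterior-weighted sum of frames
    for x in frames:
        _, post = frame_log_likelihood(x, weights, means, variances)
        for j, p in enumerate(post):
            counts[j] += p
            for i in range(d):
                sums[j][i] += p * x[i]
    adapted = []
    for j in range(k):
        if counts[j] == 0.0:
            adapted.append(list(means[j]))  # no data seen: keep the UBM mean
            continue
        alpha = counts[j] / (counts[j] + relevance)  # data-dependent mixing weight
        adapted.append([
            alpha * (sums[j][i] / counts[j]) + (1.0 - alpha) * means[j][i]
            for i in range(d)
        ])
    return adapted


def llr_score(frames, weights, adapted_means, ubm_means, variances):
    """Average per-frame log-likelihood ratio: adapted model versus UBM."""
    total = 0.0
    for x in frames:
        ll_model, _ = frame_log_likelihood(x, weights, adapted_means, variances)
        ll_ubm, _ = frame_log_likelihood(x, weights, ubm_means, variances)
        total += ll_model - ll_ubm
    return total / len(frames)
```

For example, with a toy two-component, one-dimensional UBM (`weights = [0.5, 0.5]`, `means = [[0.0], [10.0]]`, unit variances) and 16 frames at `[1.0]`, calling `map_adapt_means` moves the first mean halfway toward the data and leaves the second essentially unchanged. A test utterance would then be assigned to the accent whose adapted model gives the highest `llr_score`.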