WAVELET DETAIL COEFFICIENT AS A NOVEL WAVELET-MFCC FEATURES IN TEXT-DEPENDENT SPEAKER RECOGNITION SYSTEM
Speaker recognition is the process of recognizing a speaker from his speech. This can be used in many aspects of life, such as taking access remotely to a personal device, securing access to voice control, and doing a forensic investigation. In speaker recognition, extracting features from the speec...
Main Authors: | Syahroni Hidayat, Muhammad Tajuddin, Siti Agrippina Alodia Yusuf, Jihadil Qudsi, Nenet Natasudian Jaya |
---|---|
Format: | Article |
Language: | English |
Published: | IIUM Press, International Islamic University Malaysia, 2022-01-01 |
Series: | International Islamic University Malaysia Engineering Journal |
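Subjects: | Discrete wavelet transforms; Feature extraction; Hidden Markov Models; Speaker recognition; Wavelet coefficients |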
Online Access: | https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/1760 |
_version_ | 1819237337513590784 |
---|---|
author | Syahroni Hidayat; Muhammad Tajuddin; Siti Agrippina Alodia Yusuf; Jihadil Qudsi; Nenet Natasudian Jaya |
author_sort | Syahroni Hidayat |
collection | DOAJ |
description | Speaker recognition is the process of identifying a speaker from their speech. It has many applications, such as remote access to a personal device, securing voice-controlled access, and forensic investigation. In speaker recognition, feature extraction is the most critical step: the features represent each speech sample in a way that distinguishes it from the others. This research proposes combining the wavelet transform with Mel Frequency Cepstral Coefficients (MFCC), termed Wavelet-MFCC, for feature extraction, with a Hidden Markov Model (HMM) as the classifier. The speech signal is first decomposed with a one-level wavelet transform; only the detail sub-band coefficients are then passed to MFCC for further feature extraction. The system was evaluated on a dataset of 300 utterances from 30 speakers, each saying "HADIR" in Indonesian. Five-fold cross-validation was used: in each fold, 80% of the data was used for training and the remaining 20% for testing. The Wavelet-MFCC system achieved an accuracy of 96.67%. |
first_indexed | 2024-12-23T13:18:44Z |
format | Article |
id | doaj.art-2fa74cd33c604d5088d8475802ae0ff2 |
institution | Directory Open Access Journal |
issn | 1511-788X 2289-7860 |
language | English |
last_indexed | 2024-12-23T13:18:44Z |
publishDate | 2022-01-01 |
publisher | IIUM Press, International Islamic University Malaysia |
record_format | Article |
series | International Islamic University Malaysia Engineering Journal |
spelling | doaj.art-2fa74cd33c604d5088d8475802ae0ff2; record timestamp 2022-12-21T17:45:31Z; vol. 23, no. 1 (2022-01-01); DOI 10.31436/iiumej.v23i1.1760; author affiliations: Syahroni Hidayat (University of Mataram), Muhammad Tajuddin (Universitas Bumigora), Siti Agrippina Alodia Yusuf (Sekawan Institute), Jihadil Qudsi (Politeknik Medica Farma Husada), Nenet Natasudian Jaya (Universitas Mahasaraswati Mataram) |
title | WAVELET DETAIL COEFFICIENT AS A NOVEL WAVELET-MFCC FEATURES IN TEXT-DEPENDENT SPEAKER RECOGNITION SYSTEM |
topic | Discrete wavelet transforms; Feature extraction; Hidden Markov Models; Speaker recognition; Wavelet coefficients |
url | https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/1760 |
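
The abstract describes two steps that a small example may make concrete: a one-level wavelet decomposition that keeps only the detail sub-band, and a five-fold split of the 300 recordings. The sketch below is illustrative only. The record does not name the wavelet family, so the Haar wavelet is an assumption, as are the function names `haar_dwt_level1` and `kfold_indices` and the unshuffled, contiguous fold layout. The MFCC and HMM stages are not shown.

```python
import math

def haar_dwt_level1(signal):
    """One-level Haar DWT (assumed wavelet): return (approximation, detail)."""
    samples = list(signal)
    if len(samples) % 2:
        # Pad odd-length input by repeating the last sample.
        samples.append(samples[-1])
    scale = 1.0 / math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    return approx, detail

def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds (no shuffling)."""
    fold = n_samples // k
    for f in range(k):
        start = f * fold
        # The last fold absorbs any remainder.
        stop = start + fold if f < k - 1 else n_samples
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test

# Only the detail sub-band would be passed on to MFCC extraction.
approx, detail = haar_dwt_level1([4.0, 2.0, 5.0, 5.0])

# 300 recordings, 5 folds: each fold trains on 240 and tests on 60.
splits = list(kfold_indices(300, 5))
```

With 300 recordings and five folds, each fold holds out 60 recordings (20%) and trains on the other 240 (80%), matching the proportions given in the description.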