WAVELET DETAIL COEFFICIENT AS A NOVEL WAVELET-MFCC FEATURES IN TEXT-DEPENDENT SPEAKER RECOGNITION SYSTEM


Bibliographic Details
Main Authors: Syahroni Hidayat, Muhammad Tajuddin, Siti Agrippina Alodia Yusuf, Jihadil Qudsi, Nenet Natasudian Jaya
Format: Article
Language: English
Published: IIUM Press, International Islamic University Malaysia 2022-01-01
Series: International Islamic University Malaysia Engineering Journal
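Subjects: Discrete wavelet transforms; Feature extraction; Hidden Markov Models; Speaker recognition; Wavelet coefficients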
Online Access: https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/1760
collection DOAJ
description Speaker recognition is the process of identifying a speaker from his or her speech. It has many applications, such as remote access to personal devices, securing voice-controlled access, and forensic investigation. Feature extraction is the most critical step in speaker recognition: the features must represent each speech sample distinctively so that samples can be told apart. This research proposes combining the wavelet transform with Mel Frequency Cepstral Coefficients (MFCC), termed Wavelet-MFCC, for feature extraction, with a Hidden Markov Model (HMM) as the classifier. The speech signal is first decomposed by a one-level wavelet transform; only the detail-coefficient sub-band is kept and passed to MFCC for further feature extraction. The system was evaluated on 300 speech samples from 30 speakers uttering the word "HADIR" in Indonesian. Five-fold cross-validation was used: in each fold, 80% of the data was used for training and the remaining 20% for testing. In testing, the Wavelet-MFCC system achieved an accuracy of 96.67%.
ABSTRAK: The record also carries a Malay-language version of this abstract with the same content.
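The first stage described in the abstract (a one-level wavelet decomposition that keeps only the detail coefficients) and the five-fold 80/20 split can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet family, so the Haar wavelet is assumed here, and the function names `haar_dwt_level1` and `five_fold_indices` are hypothetical. The MFCC and HMM stages are not shown.

```python
import math


def haar_dwt_level1(signal):
    """One-level Haar DWT (assumed wavelet; the abstract names none).

    Returns (approximation, detail) coefficient lists.
    """
    samples = list(signal)
    if len(samples) % 2:
        # Pad odd-length input by repeating the last sample.
        samples.append(samples[-1])
    scale = 1.0 / math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    return approx, detail


def five_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, equally sized folds.

    Assumes n_samples is divisible by k; any leftover samples would
    only ever appear in the training sets. No shuffling is done here.
    """
    fold = n_samples // k
    for f in range(k):
        start, stop = f * fold, (f + 1) * fold
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test


# Only the detail sub-band would be handed on to the MFCC stage.
_, detail = haar_dwt_level1([1.0, 2.0, 3.0, 4.0])

# 300 samples in five folds: 240 for training, 60 for testing per fold.
folds = list(five_fold_indices(300))
```

With 300 samples each fold trains on 240 samples and tests on 60, which matches the 80%/20% split stated in the abstract. In practice the samples would likely be shuffled or stratified by speaker before splitting; the abstract does not say how the folds were drawn.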
first_indexed 2024-12-23T13:18:44Z
id doaj.art-2fa74cd33c604d5088d8475802ae0ff2
institution Directory Open Access Journal
issn 1511-788X
2289-7860
last_indexed 2024-12-23T13:18:44Z
doi 10.31436/iiumej.v23i1.1760
volume 23
issue 1
affiliation Syahroni Hidayat: University of Mataram
Muhammad Tajuddin: Universitas Bumigora
Siti Agrippina Alodia Yusuf: Sekawan Institute
Jihadil Qudsi: Politeknik Medica Farma Husada
Nenet Natasudian Jaya: Universitas Mahasaraswati Mataram
title WAVELET DETAIL COEFFICIENT AS A NOVEL WAVELET-MFCC FEATURES IN TEXT-DEPENDENT SPEAKER RECOGNITION SYSTEM
topic Discrete wavelet transforms
Feature extraction
Hidden Markov Models
Speaker recognition
Wavelet coefficients