Development of a regional voice dataset and speaker classification based on machine learning

Bibliographic Details
Main Authors: Muhammad Ismail, Shahzad Memon, Lachhman Das Dhomeja, Shahid Munir Shah, Dostdar Hussain, Sabit Rahim, Imran Ali
Format: Article
Language: English
Published: SpringerOpen 2021-03-01
Series: Journal of Big Data
Subjects: Speaker recognition systems; Speakers classification; Voice database; Accents and dialects
Online Access: https://doi.org/10.1186/s40537-021-00435-9
author Muhammad Ismail
Shahzad Memon
Lachhman Das Dhomeja
Shahid Munir Shah
Dostdar Hussain
Sabit Rahim
Imran Ali
collection DOAJ
description Abstract Voice biometrics are now commonly used to identify and authenticate users by their voice. Voice-based services such as mobile banking, access to personal devices, and logging into social networks are common examples of authenticating users through voice biometrics. In Pakistan, voice-based services are very common in the banking and mobile/cellular sectors; however, these services do not use voice features to recognize customers. The chance of these services being used under a false identity is therefore always high, and a voice-based recognition system is essential to minimize that risk. In this paper, we developed regional voice datasets for voice biometrics by collecting voice data in different local accents of Pakistan. Although the need for voice biometrics is global, especially where voice-based services are common, this paper uses Pakistan as a use case to show how to build a regional voice dataset for voice biometrics. To build the datasets, voice samples were recorded from 180 male and female speakers in two languages, English and Urdu, across five regional accents. Mel Frequency Cepstral Coefficient (MFCC) features were extracted from the collected voice samples to train Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF) and K-nearest neighbor (KNN) classifiers. The results indicate that ANN outperformed SVM, RF and KNN, achieving 88.53% and 86.58% recognition accuracy on the two datasets, respectively.
first_indexed 2024-12-16T20:40:06Z
format Article
id doaj.art-2a2e67eeb826450db58161e5908467c3
institution Directory Open Access Journal
issn 2196-1115
language English
last_indexed 2024-12-16T20:40:06Z
publishDate 2021-03-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj.art-2a2e67eeb826450db58161e5908467c3
2022-12-21T22:17:07Z
eng
SpringerOpen
Journal of Big Data
2196-1115
2021-03-01
Volume 8, Issue 1, Pages 1-18
10.1186/s40537-021-00435-9
Development of a regional voice dataset and speaker classification based on machine learning
Muhammad Ismail (AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh)
Shahzad Memon (AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh)
Lachhman Das Dhomeja (AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh)
Shahid Munir Shah (Department of Computer Science, Barrett Hodgson University)
Dostdar Hussain (Department of Computer Science, Karakoram International University)
Sabit Rahim (Department of Computer Science, Karakoram International University)
Imran Ali (Department of Computer Science, Karakoram International University)
https://doi.org/10.1186/s40537-021-00435-9
Speaker recognition systems
Speakers classification
Voice database
Accents and dialects
title Development of a regional voice dataset and speaker classification based on machine learning
topic Speaker recognition systems
Speakers classification
Voice database
Accents and dialects
url https://doi.org/10.1186/s40537-021-00435-9
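
The abstract names K-nearest neighbor (KNN) as one of the four classifiers trained on MFCC features. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes feature vectors have already been extracted, and the toy three-dimensional vectors, the speaker labels, the Euclidean distance, and k = 3 are all illustrative assumptions that do not come from the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_vectors, train_labels, test_vectors, test_labels, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_vectors, train_labels, vec, k) == label
        for vec, label in zip(test_vectors, test_labels)
    )
    return correct / len(test_labels)


# Hypothetical stand-ins for per-utterance MFCC feature vectors (made-up values).
TRAIN_VECTORS = [
    [1.0, 2.0, 0.5], [1.1, 1.9, 0.4], [0.9, 2.1, 0.6],
    [4.0, 0.5, 3.0], [4.2, 0.4, 3.1], [3.9, 0.6, 2.9],
]
TRAIN_LABELS = ["speaker_a"] * 3 + ["speaker_b"] * 3

if __name__ == "__main__":
    print(knn_predict(TRAIN_VECTORS, TRAIN_LABELS, [1.05, 2.0, 0.5]))
```

In practice the vectors would be real MFCC features with many more dimensions, and recognition accuracy would be measured on held-out recordings rather than on toy points like these.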