Development of a regional voice dataset and speaker classification based on machine learning

Abstract: At present, voice biometrics are commonly used to identify and authenticate users through their voice. Voice-based services such as mobile banking, access to personal devices, and logging into social networks are common examples of authenticating users through voice biometri...

Bibliographic Details
Main Authors:	Muhammad Ismail, Shahzad Memon, Lachhman Das Dhomeja, Shahid Munir Shah, Dostdar Hussain, Sabit Rahim, Imran Ali
Format:	Article
Language:	English
Published:	SpringerOpen 2021-03-01
Series:	Journal of Big Data
Subjects:	Speaker recognition systems; Speakers classification; Voice database; Accents and dialects
Online Access:	https://doi.org/10.1186/s40537-021-00435-9

_version_	1823927497917464576
author	Muhammad Ismail, Shahzad Memon, Lachhman Das Dhomeja, Shahid Munir Shah, Dostdar Hussain, Sabit Rahim, Imran Ali
author_sort	Muhammad Ismail
collection	DOAJ
description	Abstract: At present, voice biometrics are commonly used to identify and authenticate users through their voice. Voice-based services such as mobile banking, access to personal devices, and logging into social networks are common examples of authenticating users through voice biometrics. In Pakistan, voice-based services are very common in the banking and mobile/cellular sectors; however, these services do not use voice features to recognize customers. Therefore, the chance of these services being used under a false identity is always high. It is essential to design a voice-based recognition system to minimize the risk of false identity. In this paper, we developed regional voice datasets for voice biometrics by collecting voice data in different local accents of Pakistan. Although there is a global need for voice biometrics, especially where voice-based services are common, this paper uses Pakistan as a use case to show how to build a regional voice dataset for voice biometrics. To build the voice dataset, voice samples were recorded from 180 male and female speakers in two languages, English and Urdu, in the form of five regional accents. Mel Frequency Cepstral Coefficient (MFCC) features were extracted from the collected voice samples to train Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), and K-nearest neighbor (KNN) classifiers. The results indicate that ANN outperformed SVM, RF, and KNN, achieving 88.53% and 86.58% recognition accuracy on the two datasets, respectively.
first_indexed	2024-12-16T20:40:06Z
format	Article
id	doaj.art-2a2e67eeb826450db58161e5908467c3
institution	Directory of Open Access Journals
issn	2196-1115
language	English
last_indexed	2024-12-16T20:40:06Z
publishDate	2021-03-01
publisher	SpringerOpen
record_format	Article
series	Journal of Big Data
spelling	doaj.art-2a2e67eeb826450db58161e5908467c3 | 2022-12-21T22:17:07Z | eng | SpringerOpen | Journal of Big Data | 2196-1115 | 2021-03-01 | 8 | 1 | 1 | 18 | 10.1186/s40537-021-00435-9 | Development of a regional voice dataset and speaker classification based on machine learning | Authors: Muhammad Ismail (0), Shahzad Memon (1), Lachhman Das Dhomeja (2), Shahid Munir Shah (3), Dostdar Hussain (4), Sabit Rahim (5), Imran Ali (6) | Affiliations, in record order: (0) AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh; (1) AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh; (2) AHS Bukhari Institute of Information and Communication Technology, Faculty of Engineering and Technology, University of Sindh; (3) Department of Computer Science, Barrett Hodgson University; (4) Department of Computer Science, Karakoram International University; (5) Department of Computer Science, Karakoram International University; (6) Department of Computer Science, Karakoram International University | https://doi.org/10.1186/s40537-021-00435-9 | Speaker recognition systems; Speakers classification; Voice database; Accents and dialects
title	Development of a regional voice dataset and speaker classification based on machine learning
topic	Speaker recognition systems; Speakers classification; Voice database; Accents and dialects
url	https://doi.org/10.1186/s40537-021-00435-9

Similar Items