Data augmentation based novel approach to automatic speaker verification system

Automatic Speaker Verification (ASV) systems have recently been gaining popularity among biometric systems. Three components mainly affect the performance of such systems: the feature extraction technique, the classification model, and the spoof dataset. In recent years, researchers...

Bibliographic Details
Main Authors:	Mohit Dua, Sanil Joshi, Shelza Dua
Format:	Article
Language:	English
Published:	Elsevier 2023-12-01
Series:	e-Prime: Advances in Electrical Engineering, Electronics and Energy
Subjects:	ASV; FDLP; BILSTM; Gaussian noise; SMOTE
Online Access:	http://www.sciencedirect.com/science/article/pii/S2772671123002413

_version_	1797388634304806912
author	Mohit Dua; Sanil Joshi; Shelza Dua
author_facet	Mohit Dua; Sanil Joshi; Shelza Dua
author_sort	Mohit Dua
collection	DOAJ
description	Automatic Speaker Verification (ASV) systems have recently been gaining popularity among biometric systems. Three components mainly affect the performance of such systems: the feature extraction technique, the classification model, and the spoof dataset. In recent years, researchers have proposed integrated or complex feature extraction methods at the front end and combinations of two or more machine learning models at the back end to classify audio samples as spoofed or bonafide. This increases the overall complexity of the ASV system. Moreover, most spoofing datasets are highly imbalanced. To address these issues, this paper improves front-end feature extraction by combining a data augmentation approach, the Synthetic Minority Oversampling Technique (SMOTE), with a less complex feature extraction method, Frequency Domain Linear Prediction (FDLP). In the proposed system, audio samples are classified as bonafide or spoofed using different back-end classification models such as Random Forest (RF), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM). The proposed system has been built using the ASVspoof PA 2019, ASVspoof LA 2019 and VSDC datasets, and has been evaluated on Logical Access (LA) attacks, Presentation Attacks (PA) and multi-order replay attacks (MA). The results show that the combination of SMOTE oversampling with FDLP at the front end and BiLSTM at the back end outperforms all other implemented models. It achieves Equal Error Rate (EER) values of 0.85 %, 0.91 % and 0.55 % for the PA, LA and MA attack scenarios, respectively. The performance of the proposed system has also been evaluated in the presence of Gaussian noise. The results indicate that the proposed FDLP-SMOTE-BiLSTM system provides better performance in noisy environments and under different spoofing attack scenarios.
first_indexed	2024-03-08T22:43:38Z
format	Article
id	doaj.art-c3c37d7f400646ce9a3d1f4fb324a72b
institution	Directory Open Access Journal
issn	2772-6711
language	English
last_indexed	2024-03-08T22:43:38Z
publishDate	2023-12-01
publisher	Elsevier
record_format	Article
series	e-Prime: Advances in Electrical Engineering, Electronics and Energy
spelling	doaj.art-c3c37d7f400646ce9a3d1f4fb324a72b | 2023-12-17T06:43:29Z | eng | Elsevier | e-Prime: Advances in Electrical Engineering, Electronics and Energy | 2772-6711 | 2023-12-01 | 6 | 100346 | Data augmentation based novel approach to automatic speaker verification system | Mohit Dua (0); Sanil Joshi (1); Shelza Dua (2) | (0) Department of Computer Engineering, National Institute of Technology, Kurukshetra, India; Corresponding author. | (1) Department of Computer Engineering, National Institute of Technology, Kurukshetra, India | (2) Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India | http://www.sciencedirect.com/science/article/pii/S2772671123002413 | ASV; FDLP; BILSTM; Gaussian noise; SMOTE
spellingShingle	Mohit Dua Sanil Joshi Shelza Dua Data augmentation based novel approach to automatic speaker verification system e-Prime: Advances in Electrical Engineering, Electronics and Energy ASV FDLP BILSTM Gaussian noise SMOTE
title	Data augmentation based novel approach to automatic speaker verification system
title_full	Data augmentation based novel approach to automatic speaker verification system
title_fullStr	Data augmentation based novel approach to automatic speaker verification system
title_full_unstemmed	Data augmentation based novel approach to automatic speaker verification system
title_short	Data augmentation based novel approach to automatic speaker verification system
title_sort	data augmentation based novel approach to automatic speaker verification system
topic	ASV; FDLP; BILSTM; Gaussian noise; SMOTE
url	http://www.sciencedirect.com/science/article/pii/S2772671123002413
work_keys_str_mv	AT mohitdua dataaugmentationbasednovelapproachtoautomaticspeakerverificationsystem AT saniljoshi dataaugmentationbasednovelapproachtoautomaticspeakerverificationsystem AT shelzadua dataaugmentationbasednovelapproachtoautomaticspeakerverificationsystem
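
Note on the reported metric: the description in this record gives results as Equal Error Rate (EER), the operating point at which the false acceptance rate (spoofed samples accepted) equals the false rejection rate (bonafide samples rejected). The following is a minimal sketch of how an EER can be approximated from classifier scores. It is not the authors' evaluation code; the function name `equal_error_rate`, the convention that higher scores mean "more bonafide", and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple


def equal_error_rate(bonafide: List[float], spoof: List[float]) -> Tuple[float, float]:
    """Return (eer, threshold), assuming higher scores mean 'more bonafide'.

    FAR: fraction of spoof scores accepted (score >= threshold).
    FRR: fraction of bonafide scores rejected (score < threshold).
    The EER is approximated at the candidate threshold where |FAR - FRR|
    is smallest, reported as the mean of FAR and FRR at that threshold.
    """
    if not bonafide or not spoof:
        raise ValueError("both score lists must be non-empty")
    best_gap, best_eer, best_thr = float("inf"), 1.0, 0.0
    # Every observed score is tried as a candidate decision threshold.
    for thr in sorted(set(bonafide) | set(spoof)):
        far = sum(s >= thr for s in spoof) / len(spoof)
        frr = sum(s < thr for s in bonafide) / len(bonafide)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


# Hypothetical scores, invented purely for illustration (not from the paper).
bonafide_scores = [0.9, 0.8, 0.7, 0.4]
spoof_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(bonafide_scores, spoof_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real systems the threshold sweep is usually done over a ROC curve on thousands of trials; the exhaustive loop here is chosen for readability, not speed.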

Similar Items