Data augmentation based novel approach to automatic speaker verification system

In recent times, the Automatic Speaker Verification (ASV) systems have been gaining popularity among biometric systems. Feature extraction technique, classification model, and spoof dataset are the three main components that mainly affect the performance of such systems. In recent years, researchers...

Bibliographic Details
Main Authors: Mohit Dua, Sanil Joshi, Shelza Dua
Format: Article
Language: English
Published: Elsevier 2023-12-01
Series: e-Prime: Advances in Electrical Engineering, Electronics and Energy
Subjects: ASV; FDLP; BiLSTM; Gaussian noise; SMOTE
Online Access: http://www.sciencedirect.com/science/article/pii/S2772671123002413
author Mohit Dua
Sanil Joshi
Shelza Dua
collection DOAJ
description Automatic Speaker Verification (ASV) systems have recently been gaining popularity among biometric systems. Three components mainly affect the performance of such systems: the feature extraction technique, the classification model, and the spoof dataset. In recent years, researchers have proposed integrated or complex feature extraction methods at the front end and combinations of two or more machine learning models at the back end to classify audio samples as spoofed or bona fide. This increases the overall complexity of the ASV system. Moreover, most spoofing datasets are highly imbalanced. To address these issues, this paper improves front-end feature extraction by combining a data augmentation approach, the Synthetic Minority Oversampling Technique (SMOTE), with a less complex feature extraction method, Frequency Domain Linear Prediction (FDLP). In the proposed system, audio samples are classified as bona fide or spoofed using different back-end classification models: Random Forest (RF), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM). The proposed system has been built using the ASVspoof PA 2019, ASVspoof LA 2019 and VSDC datasets, and has been evaluated on Logical Access (LA) attacks, Presentation attacks (PA) and multi-order replay attacks (MA). The results show that the combination of SMOTE oversampling with FDLP at the front end and BiLSTM at the back end outperforms all other implemented models. It achieves Equal Error Rate (EER) values of 0.85 %, 0.91 % and 0.55 % for the PA, LA and MA attack scenarios, respectively. The performance of the proposed system has also been evaluated in the presence of Gaussian noise. The results indicate that the proposed FDLP-SMOTE-BiLSTM system provides better performance in noisy environments and under different spoofing attack scenarios.
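The SMOTE step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote_oversample`, the neighbour count `k`, the 13-dimensional stand-in feature vectors, and the choice of bona fide as the minority class are all assumptions made for illustration. The idea shown is the core of SMOTE: each synthetic sample is placed at a random point on the line between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbours.

    `minority` is a list of equal-length feature vectors (hypothetical
    stand-ins for per-utterance features such as FDLP).
    """
    rng = random.Random(seed)
    n = len(minority)
    if n < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, n - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(n)
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        # Place the new sample at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Assumed toy data: 20 minority vectors to be balanced against 200 majority ones.
rng = random.Random(1)
bonafide = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
n_spoofed = 200
synthetic = smote_oversample(bonafide, n_new=n_spoofed - len(bonafide))
print(len(bonafide) + len(synthetic))  # 200
```

Because every synthetic vector is a convex combination of two real minority vectors, the new samples stay inside the region already covered by the minority class rather than being exact copies.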
first_indexed 2024-03-08T22:43:38Z
format Article
id doaj.art-c3c37d7f400646ce9a3d1f4fb324a72b
institution Directory Open Access Journal
issn 2772-6711
language English
last_indexed 2024-03-08T22:43:38Z
publishDate 2023-12-01
publisher Elsevier
record_format Article
series e-Prime: Advances in Electrical Engineering, Electronics and Energy
spelling doaj.art-c3c37d7f400646ce9a3d1f4fb324a72b
2023-12-17T06:43:29Z
eng
Elsevier
e-Prime: Advances in Electrical Engineering, Electronics and Energy
2772-6711
2023-12-01
Volume 6, Article 100346
Data augmentation based novel approach to automatic speaker verification system
Mohit Dua (Department of Computer Engineering, National Institute of Technology, Kurukshetra, India; corresponding author)
Sanil Joshi (Department of Computer Engineering, National Institute of Technology, Kurukshetra, India)
Shelza Dua (Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, India)
http://www.sciencedirect.com/science/article/pii/S2772671123002413
ASV; FDLP; BILSTM; Gaussian noise; SMOTE
title Data augmentation based novel approach to automatic speaker verification system
topic ASV
FDLP
BiLSTM
Gaussian noise
SMOTE
url http://www.sciencedirect.com/science/article/pii/S2772671123002413
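The abstract reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (bona fide audio rejected). The following is a minimal sketch of how such a value can be approximated from classifier scores. The function name `equal_error_rate`, the convention that a higher score means "more bona fide", and the toy scores are assumptions for illustration, not details taken from the article.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumes a higher score means the sample looks more bona fide.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: bona fide samples scoring below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed samples scoring at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy scores (made up): one bona fide sample scores lower than one spoofed sample.
bonafide = [0.9, 0.8, 0.4, 0.7]
spoofed = [0.1, 0.5, 0.2, 0.3]
print(equal_error_rate(bonafide, spoofed))  # 0.25
```

With only a handful of scores the two error rates may not cross exactly, so the sketch returns their average at the closest threshold. Evaluation toolkits typically interpolate between thresholds instead.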