Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model

While speaker verification represents a critically important application of speaker recognition, it is also the most challenging and least well-understood application. Robust feature extraction plays an integral role in enhancing the efficiency of forensic speaker verification. Although the speech s...


Bibliographic Details
Main Authors:	Gaurav, Saurabh Bhardwaj, Ravinder Agarwal
Format:	Article
Language:	English
Published:	MDPI AG 2023-05-01
Series:	Electronics
Subjects:	automated speaker recognition; deep learning; ant lion optimizer; feature extraction; spectrograms; speech signals
Online Access:	https://www.mdpi.com/2079-9292/12/10/2342

_version_	1797600359875608576
author	Gaurav; Saurabh Bhardwaj; Ravinder Agarwal
author_facet	Gaurav; Saurabh Bhardwaj; Ravinder Agarwal
author_sort	Gaurav
collection	DOAJ
description	While speaker verification represents a critically important application of speaker recognition, it is also the most challenging and least well-understood application. Robust feature extraction plays an integral role in enhancing the efficiency of forensic speaker verification. Although the speech signal is a continuous one-dimensional time series, most recent models depend on recurrent neural network (RNN) or convolutional neural network (CNN) models, which are not able to exhaustively represent human speech, thus opening themselves up to speech forgery. As a result, to accurately simulate human speech and to further ensure speaker authenticity, we must establish a reliable technique. This research article presents a Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification (TTFEM-AFSV) model, which aims to overcome the limitations of the previous models. The TTFEM-AFSV model focuses on verifying speakers in forensic applications by exploiting the average median filtering (AMF) technique to discard the noise in speech signals. Subsequently, the MFCC and spectrograms are considered as the inputs to the deep convolutional neural network-based Inception v3 model, and the Ant Lion Optimizer (ALO) algorithm is utilized to fine-tune the hyperparameters related to the Inception v3 model. Finally, a long short-term memory with a recurrent neural network (LSTM-RNN) mechanism is employed as a classifier for automated speaker recognition. The performance validation of the TTFEM-AFSV model was tested in a series of experiments. Comparative study revealed the significantly improved performance of the TTFEM-AFSV model over recent approaches.
first_indexed	2024-03-11T03:47:01Z
format	Article
id	doaj.art-b8511047019f4d36ba4fbfc850be9fa8
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-11T03:47:01Z
publishDate	2023-05-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-b8511047019f4d36ba4fbfc850be9fa8; 2023-11-18T01:11:03Z; eng; MDPI AG; Electronics; 2079-9292; 2023-05-01; vol. 12, no. 10, art. 2342; 10.3390/electronics12102342; Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model; Gaurav, Saurabh Bhardwaj, Ravinder Agarwal (all: Electrical and Instrumentation Engineering Department, Thapar Institute of Engineering and Technology, Patiala 147004, India); https://www.mdpi.com/2079-9292/12/10/2342; automated speaker recognition; deep learning; ant lion optimizer; feature extraction; spectrograms; speech signals
spellingShingle	Gaurav Saurabh Bhardwaj Ravinder Agarwal Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model Electronics automated speaker recognition deep learning ant lion optimizer feature extraction spectrograms speech signals
title	Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model
title_full	Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model
title_fullStr	Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model
title_full_unstemmed	Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model
title_short	Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification Model
title_sort	two tier feature extraction with metaheuristics based automated forensic speaker verification model
topic	automated speaker recognition; deep learning; ant lion optimizer; feature extraction; spectrograms; speech signals
url	https://www.mdpi.com/2079-9292/12/10/2342
work_keys_str_mv	AT gaurav twotierfeatureextractionwithmetaheuristicsbasedautomatedforensicspeakerverificationmodel AT saurabhbhardwaj twotierfeatureextractionwithmetaheuristicsbasedautomatedforensicspeakerverificationmodel AT ravinderagarwal twotierfeatureextractionwithmetaheuristicsbasedautomatedforensicspeakerverificationmodel


Similar Items