Imperceptible black-box waveform-level adversarial attack towards automatic speaker recognition

Abstract: Automatic speaker recognition is an important biometric authentication approach with emerging applications. However, recent research has shown its vulnerability to adversarial attacks. In this paper, we propose a new type of adversarial examples by generating imperceptible adversarial sampl...


Bibliographic Details
Main Authors:	Xingyu Zhang, Xiongwei Zhang, Meng Sun, Xia Zou, Kejiang Chen, Nenghai Yu
Format:	Article
Language:	English
Published:	Springer 2022-06-01
Series:	Complex & Intelligent Systems
Subjects:	Automatic speaker recognition, Adversarial examples, Imperceptibility, Black-box attack, Differential evolution, Auditory masking
Online Access:	https://doi.org/10.1007/s40747-022-00782-x

_version_	1827982254905753600
author	Xingyu Zhang, Xiongwei Zhang, Meng Sun, Xia Zou, Kejiang Chen, Nenghai Yu
collection	DOAJ
description	Automatic speaker recognition is an important biometric authentication approach with emerging applications. However, recent research has shown its vulnerability to adversarial attacks. In this paper, we propose a new type of adversarial example: imperceptible adversarial samples for targeted attacks on black-box automatic speaker recognition systems. Waveform samples are created directly by solving an optimization problem with waveform inputs and outputs, which is more realistic in real-life scenarios. Inspired by auditory masking, a regularization term that adapts to the energy of the speech waveform is proposed for generating imperceptible adversarial perturbations. The optimization problems are then solved by a differential evolution algorithm in a black-box manner, which does not require any knowledge of the inner configuration of the recognition systems. Experiments conducted on the commonly used data sets LibriSpeech and VoxCeleb show that the proposed methods successfully perform targeted attacks on state-of-the-art speaker recognition systems while remaining imperceptible to human listeners. Given the high SNR and PESQ scores of the resulting adversarial samples, the proposed methods degrade the quality of the original signals less than several recently proposed methods, which supports the imperceptibility of the adversarial samples.
first_indexed	2024-04-09T22:31:37Z
format	Article
id	doaj.art-4a2f3536ab004517885c1dbd82c81aa7
institution	Directory of Open Access Journals
issn	2199-4536 2198-6053
language	English
last_indexed	2024-04-09T22:31:37Z
publishDate	2022-06-01
publisher	Springer
record_format	Article
series	Complex & Intelligent Systems
spelling	doaj.art-4a2f3536ab004517885c1dbd82c81aa7 | 2023-03-22T12:43:52Z | eng | Springer | Complex & Intelligent Systems | 2199-4536, 2198-6053 | 2022-06-01 | vol. 9, no. 1, pp. 65-79 | 10.1007/s40747-022-00782-x | Xingyu Zhang, Xiongwei Zhang, Meng Sun, Xia Zou (Laboratory of Intelligent Information Processing, Army Engineering University); Kejiang Chen, Nenghai Yu (Department of Electronic Engineering and Information Science, University of Science and Technology of China)
title	Imperceptible black-box waveform-level adversarial attack towards automatic speaker recognition
topic	Automatic speaker recognition, Adversarial examples, Imperceptibility, Black-box attack, Differential evolution, Auditory masking
url	https://doi.org/10.1007/s40747-022-00782-x
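
The abstract describes a black-box attack in which differential evolution searches for a waveform perturbation, guided by a regularization term that adapts to the energy of the speech. The following is a minimal sketch of that general idea, not the authors' implementation. Everything named here is an assumption for illustration: `toy_target_score` is a hypothetical stand-in for a real speaker recognition system, the inverse-RMS weighting is one simple way to make a penalty energy-adaptive, and the DE variant (rand/1/bin) and all constants are arbitrary choices.

```python
import math
import random


def frame_energy_weights(wave, frame_len=8, eps=1e-3):
    """Per-sample penalty weights: quiet frames get larger weights,
    since low-energy speech masks a perturbation less (assumed form)."""
    weights = []
    for start in range(0, len(wave), frame_len):
        frame = wave[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        weights.extend([1.0 / (rms + eps)] * len(frame))
    return weights


def snr_db(clean, adversarial):
    """Signal-to-noise ratio of the adversarial waveform, in dB."""
    signal = sum(s * s for s in clean)
    noise = sum((a - s) ** 2 for s, a in zip(clean, adversarial))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


def differential_evolution(loss, dim, bound, pop_size=20, generations=60,
                           f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin minimizer that only queries loss values (black box)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    pop[0] = [0.0] * dim  # start one candidate at "no perturbation"
    fitness = [loss(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # force at least one mutated gene
            trial = []
            for j in range(dim):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = max(-bound, min(bound, v))  # keep inside the bound
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_fit = loss(trial)
            if trial_fit <= fitness[i]:  # greedy selection
                pop[i], fitness[i] = trial, trial_fit
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


# Toy "speech": a loud first half followed by a quiet second half.
N = 32
clean = [(0.5 if n < 16 else 0.02) * math.sin(2 * math.pi * 3 * n / N)
         for n in range(N)]

# Hypothetical black box: a higher score means "more like the target speaker".
_template = [math.cos(2 * math.pi * 5 * n / N) for n in range(N)]


def toy_target_score(wave):
    return sum(w * t for w, t in zip(wave, _template)) / len(wave)


weights = frame_energy_weights(clean)
LAMBDA = 0.1  # trade-off between attack strength and audibility (assumed)


def attack_loss(delta):
    adv = [s + d for s, d in zip(clean, delta)]
    penalty = sum(w * d * d for w, d in zip(weights, delta)) / len(delta)
    return -toy_target_score(adv) + LAMBDA * penalty


delta, best_loss = differential_evolution(attack_loss, dim=N, bound=0.05)
adversarial = [s + d for s, d in zip(clean, delta)]
print("loss without perturbation:", attack_loss([0.0] * N))
print("loss after DE search:     ", best_loss)
print("SNR (dB):                 ", snr_db(clean, adversarial))
```

The optimizer touches the recognizer only through `attack_loss`, so it needs no gradients or internal parameters, which is what makes the search black-box. Against a real system, each loss evaluation would be one query to the recognizer, and the perturbation would have one dimension per audio sample, so the query budget becomes the main practical concern.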


Similar Items