Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise


Bibliographic Details
Main Authors: Christian Heider Nielsen, Zheng-Hua Tan
Format: Article
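Language: English
Published: IEEE 2023-01-01
Series: IEEE Open Journal of Signal Processing
Subjects: Adversarial examples; automatic speech recognition; deep learning; filter bank; noise robustness
Online Access: https://ieeexplore.ieee.org/document/10076798/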
collection DOAJ
description In recent years, significant progress has been made in deep model-based automatic speech recognition (ASR), leading to its widespread deployment in the real world. At the same time, adversarial attacks against deep ASR systems are highly successful. Various methods have been proposed to defend ASR systems from these attacks. However, existing classification-based methods focus on the design of deep learning models while lacking exploration of domain-specific features. This work leverages filter bank-based features to better capture the characteristics of attacks for improved detection. Furthermore, the paper analyses the potential of using speech and non-speech parts separately in detecting adversarial attacks. Finally, considering adverse environments where ASR systems may be deployed, we study the impact of acoustic noise of various types and signal-to-noise ratios. Extensive experiments show that the inverse filter bank features generally perform better in both clean and noisy environments, that detection is effective using either the speech or the non-speech part, and that acoustic noise can substantially degrade detection performance.
first_indexed 2024-04-09T20:10:42Z
id doaj.art-32d3850bab134e44a8e4b19f082e6a63
institution Directory Open Access Journal
issn 2644-1322
last_indexed 2024-04-09T20:10:42Z
spelling doaj.art-32d3850bab134e44a8e4b19f082e6a63 (2023-03-31T23:00:24Z, eng)
IEEE, IEEE Open Journal of Signal Processing, ISSN 2644-1322, 2023-01-01, vol. 4, pp. 179-187
DOI 10.1109/OJSP.2023.3256321, IEEE document 10076798
Christian Heider Nielsen, Department of Electronic Systems, Aalborg University, Aalborg, Denmark
Zheng-Hua Tan (https://orcid.org/0000-0001-6856-8928), Department of Electronic Systems, Aalborg University, Aalborg, Denmark
topic Adversarial examples
automatic speech recognition
deep learning
filter bank
noise robustness