Universal Adversarial Perturbations for Speech Recognition Systems

Neekhara, Paarth; Hussain, Shehzeen; Pandey, Prakhar; Dubnov, Shlomo; McAuley, Julian; Koushanfar, Farinaz

doi:10.21437/interspeech.2019-1353

Cited by 83 publications

(36 citation statements)

References 40 publications


“…A closely related topic are adversarial attacks, first investigated by Szegedy et al (2013) and Goodfellow et al (2015) in computer vision and later extended to text classification (Papernot et al, 2016; Ebrahimi et al, 2018b; Li et al, 2018; Hosseini et al, 2017) and translation (Ebrahimi et al, 2018a; Michel et al, 2019). Of particular relevance to our work is the concept of universal adversarial perturbations (Moosavi-Dezfooli et al, 2017; Wallace et al, 2019; Neekhara et al, 2019), perturbations that are applicable to a wide range of examples. Specifically the adversarial triggers from Wallace et al (2019) are reminiscent of the attack proposed here, with the crucial difference that their attack fixes the model's weights and finds a specific trigger, whereas the attack we explore fixes the trigger and changes the model's weights to introduce a specific response.…”

Section: Related Work (mentioning)

confidence: 94%

Weight Poisoning Attacks on Pretrained Models

Kurita, Michel, Neubig

2020

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics



Recently, NLP has seen a surge in the usage of large pre-trained models. Users download weights of models pre-trained on large datasets, then fine-tune the weights on a task of their choice. This raises the question of whether downloading untrusted pre-trained weights can pose a security threat. In this paper, we show that it is possible to construct "weight poisoning" attacks where pre-trained weights are injected with vulnerabilities that expose "backdoors" after fine-tuning, enabling the attacker to manipulate the model prediction simply by injecting an arbitrary keyword. We show that by applying a regularization method, which we call RIPPLe, and an initialization procedure, which we call Embedding Surgery, such attacks are possible even with limited knowledge of the dataset and finetuning procedure. Our experiments on sentiment classification, toxicity detection, and spam detection show that this attack is widely applicable and poses a serious threat. Finally, we outline practical defenses against such attacks.


“…Such a technique is often referred to as perturbations. Further, a single noise input can cause false recognition of any speech input, and it is termed as universal perturbation [76], [77]. Adversarial attacks can be non-targeted and targeted.…”

Section: Adversarial Perturbations (mentioning)

confidence: 99%

Deep Speaker Recognition: Process, Progress, and Challenges

et al. 2021


Speaker recognition is related to human biometrics dealing with the identification of speakers from their speech. Speaker recognition is an active research area and being widely investigated using artificially intelligent mechanisms. Though speaker recognition systems were previously constructed using handcrafted statistical means of machine learning, currently it is being shifted to state-of-the-art deep learning strategies. Further, deep learning being a fast-paced domain, an absence of comprehensive survey is observed in the current deep speaker recognition technologies. In this paper, we focus on deep speaker recognition technologies. The paper particularly introduces a taxonomy, explains the progress, architectural strategies and processes of some distinctive approaches. Further, the manuscript classifies and enlists the currently available datasets and programming tools. Finally, the paper investigates the challenges and future directives of deep speaker recognition technology.


“…However, all the aforementioned ASR adversarial attacks are individual attack through solving an optimization problem for each individual input audio, which needs high run-time requirements (e.g., several hours) to compute the adversarial examples per input audio. Alternatively, a more recent work [12] produces a single universal perturbation which can fool ASR systems causing an error in transcription. This work is in the case of untargeted attack, in which the adversary cannot specify the expected speech transcription during the phase of adversary example generation.…”

Section: Related Work (mentioning)

confidence: 99%

Real-Time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

Xie, Shi, et al.

2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


As the popularity of voice user interface (VUI) exploded in recent years, speaker recognition system has emerged as an important medium of identifying a speaker in many security-required applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against the state-of-the-art deep neural network (DNN) based speaker recognition system. Through adding an audio-agnostic universal perturbation on arbitrary enrolled speaker's voice input, the DNN-based speaker recognition system would identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of our attack by modeling the sound distortions caused by the physical over-the-air propagation through estimating room impulse response (RIR). Experiment using a public dataset of 109 English speakers demonstrates the effectiveness and robustness of our proposed attack with a high attack success rate of over 90%. The attack launching time also achieves a 100× speedup over contemporary non-universal attacks.
