2021
DOI: 10.48550/arxiv.2103.10651
Preprint

SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems

Abstract: With the wide use of Automatic Speech Recognition (ASR) in applications such as human machine interaction, simultaneous interpretation, audio transcription, etc., its security protection becomes increasingly important. Although recent studies have brought to light the weaknesses of popular ASR systems that enable out-of-band signal attack, adversarial attack, etc., and further proposed various remedies (signal smoothing, adversarial training, etc.), a systematic understanding of ASR security (both attacks and …

Cited by 1 publication (1 citation statement)
References 90 publications (222 reference statements)
“…ASR is an active and essential research area owing to its wide range of applications, such as security [2], education [3], smart healthcare [4, 5], and smart cities [6], as well as the development of interfaces and computing instruments that can enable voice processing. It is a combination of various approaches that assist in the conversion of acoustic data into text, using text matching applied to the detected speech signal occurring in the result.…”
Section: Introduction
Citation type: mentioning
confidence: 99%