Interspeech 2017
DOI: 10.21437/interspeech.2017-776

Audio Replay Attack Detection Using High-Frequency Features

Abstract: This paper presents our contribution to the ASVspoof 2017 Challenge. It addresses a replay spoofing attack against a speaker recognition system by detecting that the analysed signal has passed through multiple analogue-to-digital (AD) conversions. Specifically, we show that most of the cues that enable detection of replay attacks can be found in the high-frequency band of the replayed recordings. The described anti-spoofing countermeasures are based on (1) modelling the subband spectrum and (2) using the prop…

Cited by 128 publications (71 citation statements)
References 24 publications
“…To sum up, previous studies indicate that certain frequency subbands are potentially more informative to the detection of spoofing attacks, even though no standardized approach how that unevenly distributed information across the frequency axis should be utilized. Our current work is different from prior works [19,20,21,22,23] because most of them aim at handcrafting or learning features [24] based on the relevance of specific subbands for spoofing detection. To the best of our knowledge, there is no work in spoofing detection aiming to learn band-specific features by discriminatively training CNNs on a spectrogram input.…”
Section: Relation To Prior Work (mentioning)
confidence: 99%
“…At such distances, some acoustic features can be used to identify the sound source of the speaker, e.g., in [31], [32], the authors use the "pop noise" caused by breathing to identify a live speaker. Other efforts [33], [34], [35] do not explicitly use close distance features, but the databases they use to develop their defense strategies were recorded at In the recording phase, the attacker records or synthesizes a malicious voice command. In the playback phase, the malicious voice command is transmitted from the playback device to the victim device over the air.…”
Section: B Sound Source Identification Using Acoustic Cues (mentioning)
confidence: 99%
“…The key idea in [9,10] is actually an extension of the antispoofing technologies used for protecting automatic speaker verification (ASV) systems. Many prior efforts (such as presented in [11,12,13,14,15,16,17,18,19,20,21,22]) have attempted to differentiate between original and replayed speech using the RedDots Replayed data set [23]. While the VCS and ASV protection tasks look similar, they have some important differences, e.g., they have fundamental different user scenarios: ASV systems usually assume that the user speaks in a controlled environment and in close proximity to the systems, while modern VCSs usually support far-field speech recognition and are often used in a variety of environmental conditions indoors and outdoors.…”
Section: Introduction (mentioning)
confidence: 99%