Interspeech 2018
DOI: 10.21437/interspeech.2018-1297

Exploration of Compressed ILPR Features for Replay Attack Detection

Abstract: This paper deals with the problem of detecting replay attacks on speaker verification systems. In the literature, apart from acoustic features, source features have also been used successfully for this task. Existing source features use only the information around glottal closure instants (GCIs). We hypothesize that a feature capturing the temporal dynamics between two GCIs would be more discriminative for this task. Motivated by that, in this work we explore the use of discrete…

Cited by 22 publications (3 citation statements)
References 18 publications

“…Therefore, developing an effective countermeasure to distinguish between genuine and replayed samples has become a recent research focus [6], [7], [8], [9]. While there have been many prior efforts in this area [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], they only focus on detecting replay attacks based on single-channel input and therefore only leverage the temporal and spectral features. However, we identified three reasons why a countermeasure designed using multi-channel audio input could provide improved performance.…”
Section: Introduction (mentioning)
confidence: 99%
“…The key idea in [9,10] is actually an extension of the antispoofing technologies used for protecting automatic speaker verification (ASV) systems. Many prior efforts (such as presented in [11,12,13,14,15,16,17,18,19,20,21,22]) have attempted to differentiate between original and replayed speech using the RedDots Replayed data set [23]. While the VCS and ASV protection tasks look similar, they have some important differences, e.g., they have fundamental different user scenarios: ASV systems usually assume that the user speaks in a controlled environment and in close proximity to the systems, while modern VCSs usually support far-field speech recognition and are often used in a variety of environmental conditions indoors and outdoors.…”
Section: Introduction (mentioning)
confidence: 99%
“…To detect replayed speech from genuine speech, there are several approaches. Some research has focused on tuning the classifier [11][12][13][14], whereas other research has focused on feature extraction [15,16]. As an anti-spoofing task mainly focuses on the characteristics of the given speech, in this paper, we focus on a new feature extraction method that provides better discrimination between replayed speech and genuine speech.…”
Section: Introduction (mentioning)
confidence: 99%