2018 IEEE 4th International Conference on Identity, Security, and Behavior Analysis (ISBA) 2018
DOI: 10.1109/isba.2018.8311474

On the use of convolutional neural networks for speech presentation attack detection

Abstract: Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks (PAs), limiting their wide deployment. Several speech-based presentation attack detection (PAD) methods have been proposed recently, but most of them are based on handcrafted frequency or phase-based features. Although convolutional neural networks (CNN) have already shown breakthro…

Citations: cited by 18 publications (5 citation statements)
References: 21 publications (34 reference statements)
“…The Portuguese language BioCPqD-PA dataset [71] was collected by recording 222 people in a variety of environmental conditions, and is comprised of 27,253 authentic recordings and 391,687 samples which have been subjected to a presentation attack. One laptop was used with 24 different setups consisting of 8 loudspeakers and 3 microphones, while another single laptop was used to capture real data.…”
Section: Other Publicly Available Datasets and ASV Challenges (mentioning)
confidence: 99%
“…There are other publicly available databases. One is the AVspoof database [50] and its extension called VoicePA [49]. Another one is the Spoofing and Antispoofing (SAS) [105] database.…”
Section: Datasets and Campaigns (mentioning)
confidence: 99%
“…At such short distances, certain acoustic features can be used to identify the sound source of the speaker, e.g., in [25], the authors use the "pop noise" caused by breathing to identify a live speaker. Other efforts [26,27] do not explicitly use close-distance features, but the databases they use to develop their defense strategies were recorded at close distances [23,28,29], and therefore, these approaches may also implicitly use close-distance features. In contrast, with the help of far-field speech recognition techniques, modern VCSs can typically accept voice commands from rather long distances (i.e., several meters) [10].…”
Section: Comparing VCS and ASV Protection (mentioning)
confidence: 99%