2020
DOI: 10.1587/transinf.2019edl8115

Constant-Q Deep Coefficients for Playback Attack Detection

Abstract: Within the framework of traditional power-spectrum-based feature extraction, this paper proposes a feature, constant-Q deep coefficients (CQDC), that uses a deep neural network to describe the nonlinear relationship between the power spectrum and discriminative information, with the aim of extracting more discriminative information for playback attack detection. CQDC relies on the constant-Q transform, a deep neural network, and the discrete cosine transform. The constant-Q transform is used to convert the signal from th…
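The three stages named in the abstract (constant-Q transform, deep neural network, discrete cosine transform) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the naive constant-Q transform, the untrained random-weight network standing in for a trained DNN, the layer sizes, and all function names and parameter values below are assumptions chosen only to show how data might flow through such a pipeline.

```python
import numpy as np

def naive_cqt_log_power(signal, sr, fmin=55.0, n_bins=24, bins_per_octave=12, hop=256):
    """Naive constant-Q log power spectrum, shape (n_frames, n_bins).

    Each bin uses a Hann-windowed complex exponential whose length is
    inversely proportional to its centre frequency (constant Q).
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    lengths = np.ceil(q * sr / freqs).astype(int)
    padded = np.pad(signal, (0, lengths.max()))  # zero-pad so every frame is full
    starts = np.arange(0, len(signal), hop)
    power = np.empty((len(starts), n_bins))
    for k, (f, n) in enumerate(zip(freqs, lengths)):
        kernel = np.hanning(n) * np.exp(-2j * np.pi * f * np.arange(n) / sr) / n
        for i, s in enumerate(starts):
            power[i, k] = np.abs(np.dot(padded[s:s + n], kernel)) ** 2
    return np.log(power + 1e-10)

def dnn_hidden(features, layer_sizes=(64, 64), seed=0):
    """Forward pass through an UNTRAINED tanh network (hypothetical stand-in
    for a DNN trained to separate genuine from playback speech)."""
    rng = np.random.default_rng(seed)
    h = features
    for size in layer_sizes:
        w = rng.standard_normal((h.shape[1], size)) / np.sqrt(h.shape[1])
        h = np.tanh(h @ w)
    return h

def dct2(x, n_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first n_coeffs."""
    n = x.shape[1]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def cqdc_sketch(signal, sr, n_coeffs=20):
    """CQT log power -> hidden-layer activations -> DCT, one row per frame."""
    log_power = naive_cqt_log_power(signal, sr)
    hidden = dnn_hidden(log_power)
    return dct2(hidden, n_coeffs)
```

For example, calling `cqdc_sketch` on half a second of a 110 Hz sine sampled at 8 kHz yields one 20-dimensional vector per frame. In a real system the network weights would come from training rather than a random generator, and an optimized constant-Q implementation would replace the double loop.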

Cited by 4 publications
(3 citation statements)
References 13 publications
“…The spoofing attacks are mainly categorized into two types: physical access (PA) and logical access (LA). The PA considers the replay attack [2][3][4] while the LA includes the spoofing attacks based on text-to-speech synthesis [5][6][7][8] and voice conversion [9][10][11] technologies.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Apart from the handcrafted features, some deep features based on power spectrum were also explored in recent years. For example, constant-Q deep coefficients [20], that is obtained from a DNN by using log LPS as the input plus discrete cosine transform to extract the principal information at the last step. The use of light convolutional neural network with LPS as the input to extract deep feature for replay speech detection has been studied in [5], [7].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…• Different to the previous works, LFNS is used to feed neural networks such as DNN and ResNet to form frame- and utterance-based end-to-end systems for replay speech detection. The reason why we select DNN as the frame-based neural network and ResNet as the utterance-based neural network here is that they are widely used in the field of spoofing attack detection, such as DNN in [12], [19], [20] and ResNet in [22]. In addition, we also construct classifiers for CNOC and CNOC-LSE using DNN and ResNet, respectively.…”
Section: Introduction (citation type: mentioning)
confidence: 99%