Interspeech 2015
DOI: 10.21437/interspeech.2015-467

Combining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech

Cited by 86 publications (23 citation statements)
References 18 publications
“…Designing a single model to robustly detect unseen spoofing attacks can be challenging, as demonstrated at the ASVspoof 2015 and 2017 challenges, where the best performing systems [11,12,13,14,15,16] made use of an ensemble model combining features or scores. In this paper, we investigate LA and PA spoofing detection on the ASVspoof 2019 dataset using ensemble models.…”
Section: Introduction · Citation type: mentioning
confidence: 99%
“…The studies on fake audio detection are usually carried out from two aspects. The first is robust acoustic features based on signal processing methods [4,5,6]. The second is effective classifiers based on neural networks [7,8,9,10].…”
Section: Introduction · Citation type: mentioning
confidence: 99%
“…CQCC with Gaussian Mixture Models (GMMs) [9] is now a standard system used in spoofing detection for ASV. In [10], Cochlear filter cepstral coefficients (CFCCs) and changes in instantaneous frequencies (CFCCIFs) have been proposed for training two simple GMM classifiers for the detection of genuine and spoofing speech.…”
Section: Introduction · Citation type: mentioning
confidence: 99%
“…Recently, the use of high time-frequency resolution features has become a popular approach [10][11][12]. Higher accuracy has been achieved by directly using CQT spectrograms, from which CQCC features are extracted, together with deep neural networks (DNNs).…”
Section: Introduction · Citation type: mentioning
confidence: 99%