2022
DOI: 10.1007/s42979-022-01557-4

Noise Robust ASV Spoof Detection Using Integrated Features and Time Delay Neural Network

Cited by 16 publications
(3 citation statements)
References 28 publications
“…Malik et al. [21] represented replay attacks as a nonlinear process and proposed the ATP-GTCC [35] combination to detect the harmonic distortions. The suggested ATP-GTCC feature space is used to train a multi-class SVM classifier, which is tested for audio replay attack detection using the ECOC model.…”
Section: A Related Work
Citation type: mentioning
confidence: 99%
“…MFCC [14], Perceptual Linear Prediction (PLP) [15], and Linear Frequency Cepstral Coefficient (LFCC) [16] feature extraction techniques have remained popular among past researchers. However, in recent years, the Gammatone Cepstral Coefficients (GTCC) [17,18] feature extraction technique has become popular due to its noise robustness. Also, in the last few years, various researchers have shown that the use of hybrid or integrated features has improved the performance of ASV systems to a great extent.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The proposed research focuses on constructing an Automatic Speech Recognition (ASR) system for the Gujarati language using publicly accessible datasets. It employs a combination of MFCC- and CQCC-based integrated front-end feature extraction methodologies [4,5], as well as a Bidirectional Encoder Representations from Transformers (BERT)-based improved spell corrector algorithm and a Gated Recurrent Units (GRU)-based DeepSpeech2 model architecture [6,7]. In comparison to a model without post-processing, the results of the proposed work demonstrate a significant improvement in Word Error Rate (WER).…”
Section: Introduction
Citation type: mentioning
confidence: 99%