[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing 1991
DOI: 10.1109/icassp.1991.151073

Velocity and acceleration features in speaker recognition

Cited by 28 publications (8 citation statements)
References 2 publications
“…The second ones (denoted with ∆∆_f, as the vector of ∆∆ coefficients for the f-th frame) are obtained by performing again the time derivatives on the first ∆_f coefficients. Usually, these coefficients are termed as velocity- and acceleration-coefficients, respectively [6]. The motivation to employ, for each frame f, only the MFCC and ∆∆ coefficients relies on the fact that the feature vector X_f = [C_f ∆∆_f] (obtained by concatenating the MFCC vector C_f and the ∆∆_f one) provides the best results in terms of trade-off between percentage of correct recognitions and time needed to perform a single recognition.…”
Section: A Front-end and Feature Extraction (mentioning)
confidence: 99%
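
The construction described in this statement can be sketched in a few lines. This is only an illustration under stated assumptions, not the citing authors' implementation: the derivative is approximated with a central difference, edge frames reuse their nearest neighbour, and the names `time_derivative` and `mfcc_plus_delta_delta` are hypothetical.

```python
def time_derivative(frames):
    """Approximate the time derivative of each coefficient across frames.

    frames: list of equal-length coefficient vectors, one per frame.
    Assumption: central difference, with edge frames reusing the nearest
    neighbour. The quoted text does not specify either choice.
    """
    n = len(frames)
    out = []
    for f in range(n):
        prev = frames[max(f - 1, 0)]
        nxt = frames[min(f + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def mfcc_plus_delta_delta(mfcc):
    """Build X_f = [C_f, delta-delta_f] for every frame f."""
    delta = time_derivative(mfcc)          # velocity coefficients
    delta_delta = time_derivative(delta)   # acceleration coefficients
    # Concatenate the static vector with the acceleration vector only,
    # as in the quoted feature definition.
    return [list(c) + dd for c, dd in zip(mfcc, delta_delta)]


# Toy input: one coefficient per frame following t**2.
features = mfcc_plus_delta_delta([[0.0], [1.0], [4.0], [9.0], [16.0]])
```

For the toy input the interior acceleration value is 2.0, the second derivative of t², while the edge frames are distorted by the boundary handling.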
“…The problem of recognizing a speaker among a close-set of speakers is widely studied in the speech processing literature (see, among the others [3], [6] and references therein). The main difference between this work and most of the other works in the literature is that the speaker recognition algorithm embedded within SPECTRA is able to dynamically increase the number of recognizable speakers.…”
Section: B Speaker Recognition (mentioning)
confidence: 99%
“…A widely used method to encode some of the dynamic information over time of spectral features is known as delta features "∆" [3,5]. The time derivatives of each cepstral coefficient are obtained by differentiation and zero padding at the beginning and end of the utterance, then, the estimate of the derivative is appended to the acoustic vector, yielding a higher-dimensional feature vector.…”
Section: Extracting Delta and Delta-delta Features (mentioning)
confidence: 99%
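
A minimal sketch of the procedure this statement describes: pad the utterance with one zero frame at each end, differentiate, and append the estimate to each acoustic vector. The two-point central difference and the function name `append_delta_zero_padded` are assumptions made for illustration; the quoted text does not give the exact difference formula.

```python
def append_delta_zero_padded(frames):
    """Append a delta estimate to every frame of an utterance.

    frames: list of equal-length cepstral vectors, one per frame.
    Assumption: one all-zero frame is added before the first and after
    the last frame, and the derivative is a two-point central difference.
    """
    dim = len(frames[0])
    zero = [0.0] * dim
    padded = [zero] + [list(f) for f in frames] + [zero]

    out = []
    for t in range(1, len(padded) - 1):
        delta = [(b - a) / 2.0 for a, b in zip(padded[t - 1], padded[t + 1])]
        # Static coefficients first, then their derivative estimates.
        out.append(list(frames[t - 1]) + delta)
    return out


augmented = append_delta_zero_padded([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

Each output vector has twice the input dimension. Zero padding biases the delta values of the first and last frames, which is one reason some front-ends replicate edge frames instead.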
“…Due to the importance of such parameters, many researchers focused on improving them and trying to capture dynamic information more efficiently. One example is the use of regression features [11,12] instead of first and second order derivatives. Another recently proposed technique that uses DCT-based contextualization [13] proposes to replace MFCC features and their derivatives by a 2D-DCT transform applied on the Mel filter bank outputs.…”
Section: Introduction (mentioning)
confidence: 99%
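
For context, a common way to compute regression-based dynamic features is a least-squares slope over a window of ±N frames: d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²), with n = 1..N. The sketch below assumes that formula and replicates edge frames; whether references [11,12] of the citing paper use exactly this form is not stated in the excerpt, and `regression_delta` is a hypothetical name.

```python
def regression_delta(frames, window=2):
    """Least-squares slope of each coefficient over +/- `window` frames.

    Assumption: the standard regression formula
        d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n**2),
    with edge frames replicated when the window runs past the utterance.
    """
    n_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        acc = [0.0] * len(frames[t])
        for n in range(1, window + 1):
            ahead = frames[min(t + n, n_frames - 1)]
            behind = frames[max(t - n, 0)]
            for k, (a, b) in enumerate(zip(ahead, behind)):
                acc[k] += n * (a - b)
        out.append([v / denom for v in acc])
    return out


# A linear ramp has slope 1 per frame away from the edges.
slopes = regression_delta([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]], window=2)
```

Compared with a plain two-point difference, the wider window smooths frame-to-frame noise at the cost of some temporal resolution.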