1993
DOI: 10.1109/5.237532

Signal modeling techniques in speech recognition

Abstract: We have seen three important trends develop in the last five years in speech recognition. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or time-derivative, spectral information, have become common. Second, similarity transform techniques, often used to normalize and decorrelate parameters in some computationally inexpensive way, have become popular. Third, the signal parameter estimation problem has merged with the speech recognition process so that more sophisticated …

Cited by 588 publications (295 citation statements)
References 103 publications
“…Some systems only use the derivative of the features, not the absolute features [7,19]. Using the derivative of the signal measurements tends to amplify noise [10] but, at the same time, filters the distortions produced in linear time invariant, or slowly varying channels (like an equalization). Cepstrum Mean Normalization (CMN) is used to reduce linear slowly varying channel distortions in [24].…”
Section: Post-processing (mentioning)
confidence: 99%
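
The trade-off described in this statement can be illustrated with a minimal sketch. It assumes that a fixed linear time-invariant channel shows up as a constant additive offset in the cepstral domain, and it uses a plain first difference as a crude stand-in for the regression-based delta features that real systems typically use. The function names and the synthetic data are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def cepstral_mean_normalize(cepstra):
    """Subtract the per-coefficient mean over time (rows are frames)."""
    return cepstra - cepstra.mean(axis=0, keepdims=True)

def delta(cepstra):
    """First-order time difference between consecutive frames."""
    return np.diff(cepstra, axis=0)

rng = np.random.default_rng(0)

# Synthetic "clean" cepstra: 50 frames x 13 coefficients (illustrative only).
clean = rng.normal(size=(50, 13))

# Assumption: a fixed channel adds the same offset to every frame.
channel_offset = rng.normal(size=(1, 13))
distorted = clean + channel_offset

# Both CMN and differencing cancel the constant offset.
cmn_gap = np.max(np.abs(cepstral_mean_normalize(distorted)
                        - cepstral_mean_normalize(clean)))
delta_gap = np.max(np.abs(delta(distorted) - delta(clean)))

# Differencing white noise roughly doubles its variance.
noise = rng.normal(scale=0.1, size=(20000, 13))
variance_ratio = np.var(delta(noise)) / np.var(noise)

print(cmn_gap, delta_gap, variance_ratio)
```

In this toy setting both gaps are at floating-point rounding level, while the variance ratio comes out near 2, which is one way to read the remark that derivative features suppress slowly varying channel effects at the cost of amplifying noise.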
“…In analogy to the cryptographic hash value, content-based digital signatures can be seen as evolved versions of hash values that are robust to content-preserving transformations [4,5]. Also from a pattern matching point of view, the idea of extracting the essence of a class of objects retaining its main characteristics is at the heart of any classification system [6][7][8][9][10].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the literature, some models have been proposed for representing speech features [1,3-6,17-19]. In this paper, four well-known models, Linear Predict Coding Cepstrum (LPCC) [1], Fourier Transform Cepstral Coefficients (FTCC) [17], Generalized Mel Frequency Cepstral Coefficients (GMFCC) [3] and Wavelet Packet Transform (WPT) [18], are compared. These different modeling techniques are worthy of comparison because they represent different ways of modeling the acoustic feature distribution.…”
Section: Comparison With Existing Methods (mentioning)
confidence: 99%
“…The types of signal models are deterministic and statistical models. In statistical model (Picone 1993) one tries to characterize the statistical properties of the signal. In HMM for each state, there is an output probability distribution of an acoustic vector, and each iteration is associated with a state-transition probability.…”
Section: Development Of Training Models (mentioning)
confidence: 99%
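
The HMM structure mentioned in this statement (per-state output distributions plus state-transition probabilities) can be sketched with the standard forward algorithm. This is a minimal illustration, assuming a discrete observation alphabet in place of continuous acoustic vectors. The two-state model and all of its probabilities are made-up values, not parameters from the cited work.

```python
import itertools
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Return P(obs | model) for a discrete HMM via the forward recursion.

    pi:  initial state probabilities, shape (N,)
    A:   state-transition probabilities, A[i, j] = P(next=j | current=i)
    B:   output probabilities, B[i, k] = P(symbol=k | state=i)
    obs: sequence of observed symbol indices
    """
    alpha = pi * B[:, obs[0]]              # initialization
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, symbol]  # propagate, then weight by output
    return float(alpha.sum())

# Hypothetical two-state, two-symbol model.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])

likelihood = forward_likelihood(pi, A, B, [0, 1, 0])

# Sanity check: likelihoods over all length-3 sequences should sum to 1.
total = sum(forward_likelihood(pi, A, B, list(seq))
            for seq in itertools.product([0, 1], repeat=3))
print(likelihood, total)
```

A recognizer built on continuous features would replace the lookup `B[:, symbol]` with a density evaluation per state, for example a Gaussian mixture, while the recursion itself stays the same.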