2007
DOI: 10.1561/2000000004

The Application of Hidden Markov Models in Speech Recognition

Abstract: Hidden Markov Models (HMMs) provide a simple and effective framework for modelling time-varying spectral vector sequences. As a consequence, almost all present day large vocabulary continuous speech recognition (LVCSR) systems are based on HMMs. Whereas the basic principles underlying HMM-based LVCSR are rather straightforward, the approximations and simplifying assumptions involved in a direct implementation of these principles would result in a system which has poor accuracy and unacceptable sensitivity to ch…

Citations: cited by 439 publications (176 citation statements)
References: 140 publications (210 reference statements)
“…The phrase modeling in this application is done by a whole-phrase continuous HMM [2,10,27]. The selected model is with a left-to-right topology with no skip state and the output distributions are represented as mixture of Gaussians with diagonal covariance matrices.…”
Section: HMM Speaker Verification (mentioning)
confidence: 99%
“…The selected model is with a left-to-right topology with no skip state and the output distributions are represented as mixture of Gaussians with diagonal covariance matrices. The HMM training is carried out by well-known Baum-Welch Algorithm [10,27]. In the verification are used the individual speaker's thresholds.…”
Section: HMM Speaker Verification (mentioning)
confidence: 99%
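
The model described in the statements above (left-to-right topology, no skip transitions, Gaussian-mixture outputs with diagonal covariances) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the citing authors' implementation: the function names, the `stay_prob` parameter, the choice to start in the first state, and the use of NumPy. Baum-Welch training and the speaker-specific thresholds are not shown; the sketch only scores an observation sequence with the forward algorithm.

```python
import numpy as np

def left_to_right_transitions(n_states, stay_prob=0.6):
    """Transition matrix in which a state may only loop or advance by one.

    stay_prob is a hypothetical self-loop probability; no skip transitions exist.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = stay_prob
        A[i, i + 1] = 1.0 - stay_prob
    A[-1, -1] = 1.0  # the last state can only loop on itself
    return A

def diag_gmm_logpdf(x, weights, means, variances):
    """Log density of vector x under a diagonal-covariance Gaussian mixture.

    weights has shape (M,); means and variances have shape (M, D).
    """
    diff = x - means
    log_comp = -0.5 * (
        np.sum(np.log(2.0 * np.pi * variances), axis=1)
        + np.sum(diff ** 2 / variances, axis=1)
    )
    return np.logaddexp.reduce(np.log(weights) + log_comp)

def forward_loglik(obs, A, gmms):
    """Log-likelihood of obs (T, D) via the forward algorithm in the log domain.

    Assumes the model starts in state 0 and sums over all possible end states.
    gmms is a list of (weights, means, variances) tuples, one per state.
    """
    n = A.shape[0]
    with np.errstate(divide="ignore"):
        log_A = np.log(A)  # forbidden transitions become -inf
    log_b = np.array([[diag_gmm_logpdf(x, *g) for g in gmms] for x in obs])
    alpha = np.full(n, -np.inf)
    alpha[0] = log_b[0, 0]
    for t in range(1, len(obs)):
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)
```

In a verification setting, a score of this kind would typically be compared against a per-speaker threshold, as the citing paper mentions; how that threshold is chosen is not stated in the excerpt.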
“…Each context-dependent unit is typically represented by a hidden Markov model (HMM) with Gaussian mixture observation densities, which account for the remaining acoustic variation among different instances of the same unit. For further details about the architecture of standard HMM-based recognizers, see [4]. [Footnote 1] Linguists distinguish phones (acoustic realizations of speech sounds) from phonemes: abstract sound units, each possibly corresponding to multiple phones, such that a change in a single phoneme can change a word's identity.…”
Section: B Phones and Context-dependent Phones (mentioning)
confidence: 99%
“…Typical sub-phonetic features are articulatory features, which may be binary or multivalued and characterize in some way the configuration of the vocal tract. [Footnote 4] Roughly 80% of phonetic substitutions of consonants in the Switchboard Transcription Project data consist of a single articulatory feature change [10]. In addition, effects such as nasalization, rounding, and stop consonant epenthesis can be the result of asynchrony between articulatory trajectories [50].…”
Section: Sub-phonetic Feature Models (mentioning)
confidence: 99%