1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) 1999
DOI: 10.1109/icassp.1999.758074

Initial evaluation of hidden dynamic models on conversational speech

Abstract: Conversational speech recognition is a challenging problem primarily because speakers rarely fully articulate sounds. A successful speech recognition approach must infer intended spectral targets from the speech data, or develop a method of dealing with large variances in the data. Hidden Dynamic Models (HDMs) attempt to automatically learn such targets in a hidden feature space using models that integrate linguistic information with constrained temporal trajectory models. HDMs are a radical departure from con…

Cited by 31 publications (14 citation statements)
References 3 publications
“…These trajectories are connected to the surface acoustics by a single MLP. For an N-best rescoring task on the Switchboard corpus and a baseline WER of 48.2% from a standard HMM system, 5-best rescoring with the reference transcription included using the HDM gave a reduced error rate of 34.7% (Picone et al., 1999). An identical rescoring experiment using an HMM trained on the data used to build the HDM gave a word error rate of 44.8%.…”
Section: Continuous Articulatory Internal Representation
Citation type: mentioning
confidence: 99%
“…Similar trajectory HMMs also form the basis for parametric speech synthesis [36,37]. Subsequent work added new hidden layers into the dynamic model, thus being deep, to explicitly account for the target-directed, articulatory-like properties in human speech generation [11][12][13][38][39][40][41][42][43][44][45]. More efficient implementation of this deep architecture with hidden dynamics was achieved with nonrecursive or finite impulse response filters in more recent studies [46].…”
Section: A) a Selected Review Of Deep Generative Models Of Speech Pri…
Citation type: mentioning
confidence: 99%
“…Various models for speech recognition have been developed recently with the aim of capturing speech dynamics [1][2][3][4][5][6]. However, the theoretical backgrounds of these methods are so widely varied that they seem to be completely different from each other.…”
Section: Introduction
Citation type: mentioning
confidence: 99%