International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1990.115728

TDNN labeling for a HMM recognizer

Abstract: Neural net speech recognizers, particularly multilayer perceptrons, have been successfully applied to tasks of speaker-independent speech recognition. They characterize well the short-term spectral changes that are relevant to consonant recognition. The TDNN (Time Delay Neural Network) uses a fixed time scale and is therefore well suited to modelling phenomena with fixed transition times. On the other hand, these structures clearly lack the time-warping abilities that are equally crucial to a successful speec…
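To make the abstract's "fixed time scale" point concrete, here is a minimal sketch of one time-delay layer over a sequence of acoustic frames. This is not taken from the paper: the window of delays, the layer width, and the tanh nonlinearity are illustrative assumptions.

# A minimal sketch of one TDNN layer, assuming a NumPy-style feature matrix
# of shape (frames, coeffs); all sizes below are hypothetical.
import numpy as np

def tdnn_layer(x, weights, bias, delays=2):
    """Apply one time-delay layer: each output frame t looks at frames
    t, t+1, ..., t+delays -- a fixed, hard-coded time span."""
    frames, n_in = x.shape
    n_out = bias.shape[0]
    out = np.empty((frames - delays, n_out))
    for t in range(frames - delays):
        # Concatenate the fixed window of delayed frames into one input vector.
        window = x[t:t + delays + 1].reshape(-1)
        out[t] = np.tanh(weights @ window + bias)
    return out

# Example dimensions (hypothetical): 16 spectral coefficients per frame,
# 8 hidden units, a window of 3 consecutive frames (delays 0, 1, 2).
x = np.random.randn(100, 16)
w = np.random.randn(8, 16 * 3) * 0.1
b = np.zeros(8)
h = tdnn_layer(x, w, b, delays=2)   # shape (98, 8)

Because the window length is fixed in the architecture, the layer cannot stretch or compress its view of time, which is the time-warping limitation the abstract refers to.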

Cited by 9 publications (2 citation statements)
References 4 publications
“…Strictly related to the idea of using ANNs as vector quantizers for discrete HMMs is the concept of neural labelers [77,17]. In its basic version, this system is based on an MLP (or TDNN), fed with input acoustic features, with an output unit for each phonetic class.…”
Section: Network As Vector Quantizers For Discrete HMM
confidence: 99%
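As a rough illustration of the neural-labeler idea in the quote above, the sketch below turns each acoustic frame into a discrete symbol by taking the winning output unit of a small MLP; such symbols can then serve as observations for a discrete HMM. All sizes, weights, and the tanh activation are hypothetical, not details from the cited systems.

# A hedged sketch of a neural labeler: one output unit per phonetic class,
# the winning class index is the discrete label handed to a discrete HMM.
import numpy as np

def mlp_label(frame, w1, b1, w2, b2):
    h = np.tanh(w1 @ frame + b1)          # hidden layer
    scores = w2 @ h + b2                  # one score per phonetic class
    return int(np.argmax(scores))         # discrete label = best class

rng = np.random.default_rng(0)
n_in, n_hid, n_classes = 16, 32, 40       # hypothetical sizes
w1, b1 = rng.normal(size=(n_hid, n_in)) * 0.1, np.zeros(n_hid)
w2, b2 = rng.normal(size=(n_classes, n_hid)) * 0.1, np.zeros(n_classes)

features = rng.normal(size=(100, n_in))   # 100 acoustic frames
symbols = [mlp_label(f, w1, b1, w2, b2) for f in features]
# `symbols` is now a discrete observation sequence usable by a discrete HMM.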
“…It is a modular combination of TDNNs, where the nets are trained as phoneme classifiers by BP on a training set labeled by hand. The TDNNs' internal representations (activations of the second hidden layer) of the acoustic features are used as inputs to a discrete HMM, the latter being separately trained using standard algorithms. Speaker-independent (five males for training, two males for test), isolated Korean-word (75-word dictionary) recognition task: 44.9% relative WER reduction with respect to the standard discrete HMM. Neural labelers [77,17]: an MLP (or TDNN) with an output unit for each phonetic class is trained to yield output values equal to 0 or 1, according to whether or not the input observation belongs to the class. The output from the MLP is passed on to a standard discrete HMM that uses it as a class label within a Viterbi decoding paradigm.…”
Section: Model
confidence: 99%
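The quote only states that the class labels are used within a Viterbi decoding paradigm; a minimal log-domain Viterbi decoder over such a discrete symbol sequence might look like the sketch below. The transition matrix, emission matrix, and initial distribution are placeholders, assumed here for illustration.

# Log-domain Viterbi over a discrete observation sequence (e.g. the `symbols`
# produced by a neural labeler). log_pi: (n_states,), log_A: (n_states, n_states),
# log_B: (n_states, n_symbols) -- all assumed to come from a separately trained HMM.
import numpy as np

def viterbi(symbols, log_pi, log_A, log_B):
    """Return the most likely HMM state sequence for discrete observations."""
    n_states = log_A.shape[0]
    T = len(symbols)
    delta = np.full((T, n_states), -np.inf)   # best log-score ending in each state
    psi = np.zeros((T, n_states), dtype=int)  # backpointers
    delta[0] = log_pi + log_B[:, symbols[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: from state i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, symbols[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]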