IEEE International Conference on Acoustics, Speech and Signal Processing, 2002
DOI: 10.1109/icassp.2002.1005852

Connectionist language modeling for large vocabulary continuous speech recognition

Abstract: This paper describes ongoing work on a new approach to language modeling for large vocabulary continuous speech recognition. Almost all state-of-the-art systems use statistical n-gram language models estimated on text corpora. One principal problem with such language models is that many of the n-grams are never observed even in very large training corpora, and therefore it is common to back off to a lower-order model. In this paper we propose to address this problem by carrying out the estimation task…
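The back-off mechanism mentioned in the abstract can be made concrete with a short sketch. The snippet below assumes a generic "stupid back-off"-style constant weight alpha and Counter-based n-gram counts; it illustrates the fallback chain from trigram to bigram to unigram, not the exact discounting used in the paper's baseline.

    # Minimal sketch of back-off estimation, assuming constant
    # "stupid back-off"-style weighting (alpha); Counter returns 0
    # for unseen n-grams, which triggers the fallback.
    from collections import Counter

    def backoff_prob(w1, w2, w3, tri, bi, uni, alpha=0.4):
        """P(w3 | w1 w2): use the trigram if observed, else back off."""
        if tri[(w1, w2, w3)] > 0:
            return tri[(w1, w2, w3)] / bi[(w1, w2)]
        if bi[(w2, w3)] > 0:                       # back off to bigram
            return alpha * bi[(w2, w3)] / uni[w2]
        total = sum(uni.values())                  # unigram floor
        return alpha * alpha * uni[w3] / total if total else 0.0

    # Usage: tri, bi, uni are Counter objects filled from a training corpus.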

Cited by 48 publications (30 citation statements); references 4 publications.

“…This formulation applies to a discriminant variant of the RBM called the Discriminative RBM. Such conditional energy-based models have also been exploited in a series of probabilistic language models based on neural networks (Bengio et al., 2001; Schwenk & Gauvain, 2002; Bengio, Ducharme, Vincent, & Jauvin, 2003; Xu, Emami, & Jelinek, 2003; Schwenk, 2004; Schwenk & Gauvain, 2005; Mnih & Hinton, 2009). That formulation (or generally when it is easy to sum or maximize over the set of values of the terms of the partition function) has been explored at length (LeCun & Huang, 2005; LeCun et al., 2006).…”
Section: Conditional Energy-based Models
confidence: 99%
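To make the partition-function remark in this statement concrete: in such neural language models the energy is a per-word score, and the normalizer Z(h) is a tractable sum over the finite vocabulary. A minimal NumPy sketch, with assumed shapes and variable names:

    import numpy as np

    def next_word_distribution(h, W, b):
        """P(w | h) from energies E(w, h) = -(W @ h + b)[w]; the
        partition function Z(h) is an explicit sum over the vocabulary."""
        scores = W @ h + b              # one score (-energy) per word
        scores -= scores.max()          # numerical stability
        unnorm = np.exp(scores)         # exp(-E(w, h))
        return unnorm / unnorm.sum()    # divide by Z(h)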
“…The idea of distributed representation is an old idea in machine learning and neural networks research (Hinton, 1986; Rumelhart et al., 1986a; Miikkulainen & Dyer, 1991; Bengio, Ducharme, & Vincent, 2001; Schwenk & Gauvain, 2002), and it may be of help in dealing with the curse of dimensionality and the limitations of local generalization. A cartoon local representation for integers i ∈ {1, 2, .…”
Section: Learning Distributed Representations
confidence: 99%
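The local-versus-distributed contrast in this quote can be illustrated in a few lines; the vocabulary size, embedding dimension, and random lookup table below are illustrative assumptions:

    import numpy as np

    V, d = 10000, 64                    # vocabulary size, feature dimension
    one_hot = np.zeros(V)
    one_hot[42] = 1.0                   # local code: a single active unit
    E = 0.01 * np.random.randn(V, d)    # lookup table (learned in practice)
    distributed = E[42]                 # d shared, real-valued features
    # Because words share feature dimensions in E, the model can generalize
    # to n-grams never seen in training, unlike the purely local code.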
“…ing (see e.g. [3], [4], and [5]). However, there are fundamental differences in the way neural networks have previously been applied to speech recognition tasks.…”
Section: Introduction
confidence: 99%
“…So one straightforward solution to make the network work faster is to reduce the output vocabulary size. For example, in word error rate (WER) experiments the output vocabulary can be limited to a certain number of the most frequent words, which would be a fraction of the actual vocabulary (Schwenk & Gauvain, 2002). Both the training and evaluation time are reduced proportionally with the reduction in output vocabulary size.…”
Section: Vocabulary Limitation
confidence: 99%
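The shortlist idea described in this statement can be sketched as follows; K and the sorted-by-frequency assumption are illustrative, and the step where probability mass for out-of-shortlist words is redistributed to a back-off n-gram model is omitted:

    import numpy as np

    def shortlist_softmax(h, W, b, K):
        """Softmax over only the K most frequent words (rows of W are
        assumed sorted by training-corpus frequency)."""
        scores = W[:K] @ h + b[:K]      # O(K*d) instead of O(V*d)
        scores -= scores.max()          # numerical stability
        p = np.exp(scores)
        return p / p.sum()              # distribution over the shortlist only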