2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)
DOI: 10.1109/iscslp.2016.7918423

On training bi-directional neural network language model with noise contrastive estimation

Abstract: We propose to train a bi-directional neural network language model (NNLM) with noise contrastive estimation (NCE). Experiments are conducted on a rescoring task on the PTB data set. It is shown that the NCE-trained bi-directional NNLM outperformed the one trained with conventional maximum likelihood training, but, regrettably, it still did not outperform the baseline uni-directional NNLM. Introduction: Recent years have witnessed exciting performance improvements in the field of language modeling, largely due to introduct…
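The abstract refers to noise contrastive estimation without stating the objective. As a quick reference, the sketch below shows the standard per-example NCE loss for an NNLM output layer: a binary classification of the true next word against k sampled noise words, using the self-normalization assumption. The shapes, the unigram noise distribution, and all variable names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the standard NCE objective for a language model output layer
# (hypothetical names and shapes; not the authors' implementation).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(target_score, noise_scores, target_noise_logprob,
             noise_noise_logprobs, k):
    """NCE loss for one (context, target) pair.

    target_score         : unnormalized model score s(w, h) of the true word
    noise_scores         : scores s(w_i, h) for the k sampled noise words
    target_noise_logprob : log q(w) of the true word under the noise distribution
    noise_noise_logprobs : log q(w_i) for the k noise samples
    k                    : number of noise samples per true word
    """
    # With exp(s) treated as an (unnormalized) probability,
    # P(D=1 | w, h) = sigmoid(s(w, h) - log(k * q(w))).
    pos = np.log(sigmoid(target_score - (np.log(k) + target_noise_logprob)))
    neg = np.log(1.0 - sigmoid(noise_scores - (np.log(k) + noise_noise_logprobs)))
    return -(pos + neg.sum())

# Toy usage with a unigram noise distribution over a 4-word vocabulary.
rng = np.random.default_rng(0)
unigram = np.array([0.4, 0.3, 0.2, 0.1])
k = 2
target = 1
noise_ids = rng.choice(4, size=k, p=unigram)
scores = rng.normal(size=4)  # stand-in for the model scores s(w, h)
loss = nce_loss(scores[target], scores[noise_ids],
                np.log(unigram[target]), np.log(unigram[noise_ids]), k)
print(f"NCE loss: {loss:.4f}")
```

The appeal of this objective for bi-directional models is that it avoids computing the full softmax normalizer over the vocabulary at training time.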

Cited by 20 publications (13 citation statements)
References 19 publications
“…In contrast, by using both their scores in the DDM (model 8), we can steadily reduce the WER from when using only one of their scores (model 6 or 7). This result confirms that the DDM can effectively exploit complementary features and encourages us to use other types of LSTMLM scores in addition [27][28][29][30][31].…”
Section: Experimental Settings (supporting)
confidence: 67%
“…We have improved our DDM for rescoring N -best speech recognition hypothesis lists by using the backward LSTMLM score and ensemble encoders. Future work will include the use of other types of LSTMLM scores [27][28][29][30][31] and the use of a mixture-of-experts framework [36,37]. We also plan to compare our DDM with DLMs [16,17] and apply it to rescoring N -best machine translation hypothesis lists [41,42].…”
Section: Discussion (mentioning)
confidence: 99%
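The statement above concerns combining forward and backward LSTMLM scores when rescoring N-best speech recognition hypothesis lists. As a rough illustration only (a plain log-linear interpolation, not the DDM described in the quoted paper), such a rescoring pass might look like the hypothetical sketch below; all names and weights are assumptions.

```python
# Hypothetical sketch: rescoring an N-best list with interpolated
# forward and backward LM scores (not the quoted paper's DDM).

def rescore_nbest(hypotheses, am_scores, fwd_lm_scores, bwd_lm_scores,
                  lm_weight=0.5, fwd_frac=0.5):
    """Return hypotheses sorted by a combined log score (best first).

    hypotheses    : list of candidate word sequences
    am_scores     : acoustic-model log scores, one per hypothesis
    fwd_lm_scores : forward LM log-probabilities, one per hypothesis
    bwd_lm_scores : backward LM log-probabilities, one per hypothesis
    """
    combined = []
    for hyp, am, fwd, bwd in zip(hypotheses, am_scores,
                                 fwd_lm_scores, bwd_lm_scores):
        lm = fwd_frac * fwd + (1.0 - fwd_frac) * bwd  # interpolate the two LMs
        combined.append((am + lm_weight * lm, hyp))
    return [h for _, h in sorted(combined, reverse=True)]

# Toy usage with made-up scores for three hypotheses.
best = rescore_nbest(
    ["a b c", "a b d", "a c c"],
    am_scores=[-10.2, -10.5, -11.0],
    fwd_lm_scores=[-5.1, -4.2, -6.3],
    bwd_lm_scores=[-5.4, -4.0, -6.1],
)
print(best[0])
```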
“…Several previous work [10,11,12] have investigated the possibility to define a "bi-directional language model", by directly replacing the uni-directional LSTM for the conditional probability…”
Section: Related Work and Motivation (mentioning)
confidence: 99%
“…A simple language model is an ngram [1]. In recent years, recurrent neural network language models (RNNLMs) have consistently surpassed traditional ngrams in ASR and related tasks [2,3,4,5,6].…”
Section: Introduction (mentioning)
confidence: 99%