Interspeech 2016
DOI: 10.21437/interspeech.2016-422

Sequential Recurrent Neural Networks for Language Modeling

Abstract: Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some context information that cycles in the network. This paper presents a novel approach, which bridges the gap between these two categories of networks. In particular, we propose an architecture which takes advantage of the explicit, sequential enumeration of the word history in …
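To make the contrast in the abstract concrete, here is a minimal sketch (not the paper's code) of the two conditioning schemes: an FNN language model that scores the next word from a fixed window of the last N-1 words, and an RNN language model that sees only the last word plus a hidden state that cycles through the network. All sizes, weights, and function names (fnn_next_word_probs, rnn_step) are illustrative assumptions.

```python
# Illustrative sketch of FNN vs. RNN next-word prediction; sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
V, E, H, N = 1000, 32, 64, 4          # vocab size, embedding dim, hidden dim, n-gram order

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

emb   = rng.normal(scale=0.1, size=(V, E))   # shared word embeddings
W_out = rng.normal(scale=0.1, size=(H, V))   # shared output projection

# --- Feedforward (N-gram) LM: P(w_t | w_{t-N+1}, ..., w_{t-1}) ---
W_ff = rng.normal(scale=0.1, size=((N - 1) * E, H))

def fnn_next_word_probs(history):
    """history: the last N-1 word ids (the fixed context window)."""
    x = np.concatenate([emb[w] for w in history])   # concatenated window embeddings
    h = np.tanh(x @ W_ff)
    return softmax(h @ W_out)

# --- Recurrent LM: P(w_t | w_{t-1}, h_{t-1}) ---
W_in  = rng.normal(scale=0.1, size=(E, H))
W_rec = rng.normal(scale=0.1, size=(H, H))

def rnn_step(prev_word, h_prev):
    """One step: condition only on the previous word and the recycled hidden state."""
    h = np.tanh(emb[prev_word] @ W_in + h_prev @ W_rec)
    return softmax(h @ W_out), h

# Usage on a toy word-id sequence.
seq = [5, 17, 256, 42, 7]
p_fnn = fnn_next_word_probs(seq[-(N - 1):])   # sees only the last 3 words
h = np.zeros(H)
for w in seq:
    p_rnn, h = rnn_step(w, h)                 # sees last word + recurrent state
```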

Cited by 3 publications (7 citation statements), published between 2017 and 2021
References 15 publications

“…This conclusion shows, similarly to other work e.g. [15,13], that recurrent models can be further improved using N-gram/feedforward information, given that they model different linguistic features. Fig.…”
Section: PTB Experiments (supporting)
Confidence: 85%
“…This category typically leads to a significant increase in the number of parameters when combining multiple models. In a first attempt to circumvent these problems, we have recently proposed an SRNN model [15], which combines FFN information and RNN through additional sequential connections at the hidden layer. Although SRNN was successful and did not noticeably suffer from the aforementioned problems, it was solely designed to combine RNN and FNN and is, therefore, not well-suited for other architectures.…”
Section: Model Combination for Language Modeling (mentioning)
Confidence: 99%
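The quoted description of SRNN, which combines FNN information and RNN "through additional sequential connections at the hidden layer", suggests a hidden layer that receives both the usual recurrent path and extra connections from the explicit window of the last N-1 words. The sketch below follows that reading; it is an illustrative assumption about the general idea, not the paper's actual equations or parameterization.

```python
# Hedged sketch: a hidden state fed by (a) the standard recurrent path over the last
# word and (b) extra connections from the concatenated N-1 word window (the "FNN part").
import numpy as np

rng = np.random.default_rng(1)
V, E, H, N = 1000, 32, 64, 4

emb    = rng.normal(scale=0.1, size=(V, E))
W_in   = rng.normal(scale=0.1, size=(E, H))            # recurrent input weights (last word)
W_rec  = rng.normal(scale=0.1, size=(H, H))            # standard recurrence
W_seq  = rng.normal(scale=0.1, size=((N - 1) * E, H))  # extra connections from the word window
W_out  = rng.normal(scale=0.1, size=(H, V))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def srnn_like_step(window, h_prev):
    """window: the last N-1 word ids (oldest first); h_prev: previous hidden state."""
    x_last = emb[window[-1]]                            # RNN part: last word only
    x_win  = np.concatenate([emb[w] for w in window])   # FNN part: explicit history window
    h = np.tanh(x_last @ W_in + h_prev @ W_rec + x_win @ W_seq)
    return softmax(h @ W_out), h

# Usage on a toy window of word ids.
h = np.zeros(H)
probs, h = srnn_like_step([5, 17, 256], h)
```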
“…However, the likelihood of a word is also determined by linguistic material outside the ngram window, that is, in a preceding sentence, or even by extralinguistic context. Effects of linguistic expressions in prior discourse can be taken into account with more recent and advanced language modeling techniques [12][13][14][15][16][17][18], but since these models are trained on text corpora too, they do not take into account extralinguistic context. Modeling effects of extralinguistic context is particularly important in absence of linguistic context, i.e.…”
Section: Introduction (mentioning)
Confidence: 99%
“…The LTCB data split and processing is the same as the one used in [19,20]. In particular, the LTCB vocabulary is limited to the 80K most frequent words with all remaining words replaced by <unk>.…”
Section: Experiments and Results (mentioning)
Confidence: 99%
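The preprocessing described in this quote, limiting the vocabulary to the 80K most frequent words and mapping everything else to <unk>, can be sketched as follows. Tokenization and the actual LTCB splits are not shown, and the helper names are illustrative.

```python
# Minimal sketch of vocabulary capping with <unk> replacement.
from collections import Counter

VOCAB_SIZE = 80_000
UNK = "<unk>"

def build_vocab(tokenized_sentences, size=VOCAB_SIZE):
    """Keep only the `size` most frequent word types."""
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    return {w for w, _ in counts.most_common(size)}

def apply_unk(tokenized_sentences, vocab):
    """Replace every out-of-vocabulary token with <unk>."""
    return [[tok if tok in vocab else UNK for tok in sent] for sent in tokenized_sentences]

# Usage on a toy corpus with a deliberately small vocabulary cap.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "rare", "word"]]
vocab = build_vocab(corpus, size=4)
print(apply_unk(corpus, vocab))
```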