Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015
DOI: 10.3115/v1/p15-2084
Dependency Recurrent Neural Language Models for Sentence Completion

Abstract: Recent work on language modelling has shifted focus from count-based models to neural models. In these works, the words in each sentence are always considered in a left-to-right order. In this paper we show how we can improve the performance of the recurrent neural network (RNN) language model by incorporating the syntactic dependencies of a sentence, which have the effect of bringing relevant contexts closer to the word being predicted. We evaluate our approach on the Microsoft Research Sentence Completion Challenge…
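The central idea in the abstract can be sketched in a few lines of Python (a toy illustration under assumed names and a hypothetical parse, not the authors' code): instead of conditioning each word on its left-to-right predecessors, condition it on its chain of ancestors in the dependency tree, so that syntactically relevant words sit closer to the word being predicted.

```python
def left_to_right_context(words, i):
    """Standard RNN LM context: all words preceding position i."""
    return words[:i]

def dependency_context(words, heads, i):
    """Context for word i: its chain of ancestors in the dependency tree.
    heads[i] is the index of word i's head, or -1 for the root."""
    context = []
    j = heads[i]
    while j != -1:
        context.append(words[j])
        j = heads[j]
    return list(reversed(context))  # root first, immediate head last

# Toy example: "the cat on the mat sat" with a hypothetical parse
# in which "sat" is the root and "cat" is its subject.
words = ["the", "cat", "on", "the", "mat", "sat"]
heads = [1, 5, 1, 4, 2, -1]

print(left_to_right_context(words, 5))   # ['the', 'cat', 'on', 'the', 'mat']
print(dependency_context(words, heads, 4))  # ['sat', 'cat', 'on']
```

Predicting "sat" left-to-right conditions on all five preceding words, whereas the dependency context of "mat" is only its three ancestors, which is the sense in which relevant context is brought closer.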

Cited by 36 publications (28 citation statements). References 20 publications (25 reference statements).
“…They tested a variety of language modeling approaches using their task, and report that well-trained generative n-gram models achieve correct predictions ≈ 30% of the time. State-of-the-art performance on their word prediction task, using recurrent neural network language models, reaches scores in the mid-50% range (Mirowski and Vlachos, 2015; Mnih and Kavukcuoglu, 2013).…”
Section: Related Work and Discussion
confidence: 99%
“…Neural network language models [22,24,43] have attracted a lot of attention recently given their dense and learnable representation form and generalization property, as a contrast to the traditional bag-of-words representations. Word2vec skip-gram [23] (cf.…”
Section: Related Work
confidence: 99%
“…The best performing LSTM is worse than a LDTREELSTM (d = 300). The input and output embeddings (W_e and W_ho) dominate the number of parameters in all neural models except for RNNME, depRNN+3gram and ldepRNN+4gram, which include a ME model that contains 1 billion sparse n-gram features (Mikolov, 2012; Mirowski and Vlachos, 2015). The number of parameters in TREELSTM and LDTREELSTM is not much larger compared to LSTM due to the tied W_e and W_ho matrices.…”
Section: Microsoft Sentence Completion Challenge
confidence: 99%
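The parameter-count argument in the quoted passage above (the input embedding W_e and the output projection W_ho dominate, so tying them keeps a tree-structured model close in size to a plain LSTM) can be made concrete with a minimal PyTorch-style sketch; the class and variable names here are illustrative assumptions, not the authors' implementation.

```python
import torch.nn as nn

class TiedLSTMLM(nn.Module):
    """Language model in which the two vocabulary-sized matrices are shared."""
    def __init__(self, vocab_size, d):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, d)   # W_e: vocab_size x d
        self.lstm = nn.LSTM(d, d, batch_first=True)  # recurrent weights: O(d^2)
        self.decoder = nn.Linear(d, vocab_size)      # W_ho: maps d -> vocab_size
        self.decoder.weight = self.encoder.weight    # tie the two big matrices

    def forward(self, token_ids):
        hidden, _ = self.lstm(self.encoder(token_ids))
        return self.decoder(hidden)  # logits over the vocabulary
```

With the tie in place the vocabulary-sized matrix is stored once, so replacing the recurrent cell with a tree-structured variant changes only the comparatively small O(d^2) recurrent weights, which is why the quoted comparison finds TREELSTM and LDTREELSTM barely larger than the plain LSTM.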
“…Emami et al (2003) and Sennrich (2015) estimate the parameters of a structured language model using feed-forward neural networks (Bengio et al, 2003). Mirowski and Vlachos (2015) re-implement the model of Gubbins and Vlachos (2013) with RNNs. They view sentences as sequences of words over a tree.…”
Section: Introduction
confidence: 99%
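As a schematic of the "sequences of words over a tree" view mentioned in the quoted passage (details such as dependency labels and sibling context are omitted here), the probability of a sentence w_1, …, w_n given its dependency tree T factorizes over ancestor paths rather than left-to-right prefixes:

$$
P(w_1, \ldots, w_n \mid T) \;=\; \prod_{i=1}^{n} P\bigl(w_i \mid \operatorname{ancestors}(w_i, T)\bigr)
$$

Each word is thus predicted from the path of words connecting it to the root of the tree, which is, roughly, the structure the RNN of Mirowski and Vlachos (2015) is unrolled over.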