Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016
DOI: 10.18653/v1/n16-1035

Top-down Tree Long Short-Term Memory Networks

Abstract: Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have been successfully applied to a variety of sequence modeling tasks. In this paper we develop Tree Long Short-Term Memory (TREELSTM), a neural network model based on LSTM, which is designed to predict a tree rather than a linear sequence. TREELSTM defines the probability of a sentence by estimating the generation probability of its dependency tree. At each time step, a node is generated based on…
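
To make the generation scheme in the abstract concrete, the following Python sketch factorizes a sentence's probability over its dependency tree, scoring one node per step given the nodes generated so far. This is a minimal illustration under assumed names (`Node`, `cond_log_prob`, `tree_log_prob`), not the authors' implementation; in the actual TREELSTM the conditional distribution comes from LSTM states over the partial tree, whereas the placeholder here is a uniform toy distribution so the script runs.

```python
# Minimal sketch (not the authors' code): log-probability of a dependency
# tree factorized top-down, one node per step, conditioned on the nodes
# generated so far. `cond_log_prob` is a hypothetical placeholder for the
# LSTM-based conditional distribution described in the abstract.
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    word: str
    children: List["Node"] = field(default_factory=list)


def cond_log_prob(word: str, parent: Optional[str], generated: List[str]) -> float:
    # Toy stand-in for p(word | parent, nodes generated so far); a real
    # TreeLSTM would encode `generated` with recurrent hidden states.
    vocab_size = 5
    return math.log(1.0 / vocab_size)


def tree_log_prob(root: Node) -> float:
    """Sum per-node log-probabilities in a top-down (breadth-first) order."""
    log_p, generated = 0.0, []
    queue = [(root, None)]
    while queue:
        node, parent = queue.pop(0)
        log_p += cond_log_prob(node.word, parent, generated)
        generated.append(node.word)
        queue.extend((child, node.word) for child in node.children)
    return log_p


if __name__ == "__main__":
    tree = Node("barks", [Node("dog", [Node("the")]), Node("loudly")])
    print(f"log p(tree) = {tree_log_prob(tree):.3f}")
```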

Cited by 56 publications (39 citation statements) · References 28 publications
“…The architecture of tree LSTMs varies depending on the task. Some options include summarizing over the children, adding a separate forget gate for each child (Tai et al., 2015), recurrent propagation among siblings (Zhang et al., 2016), or use of stack LSTMs (Dyer et al., 2015). Our work differs from these studies in two respects: the tree structure here characterizes a discussion rather than a single sentence; …”
Section: Related Work
confidence: 99%
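
The per-child forget gate the excerpt attributes to Tai et al. (2015) can be written compactly. The sketch below is an illustrative PyTorch rendering of a child-sum Tree-LSTM cell under assumed names and toy sizes (`ChildSumTreeLSTMCell`, 8-dimensional input, 16-dimensional hidden state); it is not code from any of the cited papers.

```python
# Illustrative child-sum Tree-LSTM cell in the spirit of Tai et al. (2015):
# children's hidden states are summed for the input/output/update gates,
# while each child gets its own forget gate over its memory cell.
import torch
import torch.nn as nn


class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.W_iou = nn.Linear(input_size, 3 * hidden_size)
        self.U_iou = nn.Linear(hidden_size, 3 * hidden_size, bias=False)
        self.W_f = nn.Linear(input_size, hidden_size)
        self.U_f = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x, child_h, child_c):
        # x: (input_size,); child_h, child_c: (num_children, hidden_size)
        h_tilde = child_h.sum(dim=0)                       # summarize children
        i, o, u = self.W_iou(x).add(self.U_iou(h_tilde)).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.W_f(x) + self.U_f(child_h))  # one gate per child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c


# Toy usage: a node with two children.
cell = ChildSumTreeLSTMCell(8, 16)
h, c = cell(torch.randn(8), torch.randn(2, 16), torch.randn(2, 16))
```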
“…Another drawback is that they needed special tokens for predicting branches, which are not necessary for MWPs because all operators are binary. A similar framework is also used in code generation (Zhang et al., 2016; Yin and Neubig, 2017). Alvarez-Melis and Jaakkola (2017) presented doubly recurrent neural networks to predict tree topology explicitly.…”
Section: Seq2tree Architectures
confidence: 99%
“…Kementchedjhieva and Lopez (2018) found a single unit in their English CNLM that seems, qualitatively, to be tracking morpheme/word boundaries. Since they trained the model with whitespace, the main function of this unit could simply be to predict the very frequent whitespace character. … (Mikolov, 2012), Word RNN (Zweig et al., 2012), Word LSTM and LdTreeLSTM (Zhang et al., 2016). We further report models incorporating distributional encodings of semantics (right): Skipgram(+RNNs) from Mikolov et al. (2013a), the PMI-based model of Woods (2016), and the Context-Embedding-based approach of Melamud et al. (2016).…”
Section: Boundary Tracking in CNLMs
confidence: 99%