“…It is noteworthy that there are many other important miscellaneous works that we do not mention in the previous sections. For example, numerous works have proposed to improve upon vanilla gradient-based methods [174,178,65]; linguistic rules such as negation and morphological inflection can be extracted by neural models [141,142,158]; probing tasks can be used to explore linguistic properties of sentences [3,80,43,75,89,74,34]; the hidden state dynamics in recurrent networks have been analysed to illuminate the learned long-range dependencies [73,96,67,179,94]; [169,166,168,101,57,167] studied the ability of neural sequence models to induce lexical, grammatical and syntactic structures; [91,90,12,136,159,24,151,85] modeled the reasoning process of the model to explain its behavior; [157,139,28,163,219,170,180,137,106,58,162,81...…”