Proceedings of the 5th Workshop on Representation Learning for NLP 2020
DOI: 10.18653/v1/2020.repl4nlp-1.22

Evaluating Compositionality of Sentence Representation Models

Abstract: We evaluate the compositionality of general-purpose sentence encoders by proposing two metrics that quantify their compositional understanding capability. We introduce a novel metric, Polarity Sensitivity Scoring (PSS), which uses sentiment perturbations as a proxy for measuring compositionality. We then compare results from PSS with those obtained via our proposed extension of a metric called Tree Reconstruction Error (TRE) (Andreas, 2019), where compositionality is evaluated by measuring how w…
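
Because the abstract is truncated before PSS is fully specified, the following is only a minimal illustrative sketch of a polarity-sensitivity score built from sentiment perturbations, not the paper's actual formula; the `encode` callable, the word-flip perturbation, and the cosine-distance choice are all assumptions.

```python
# Illustrative sketch only: the exact PSS formula is not given in the
# truncated abstract. We assume a sentence encoder and measure how far an
# embedding moves when a polarity word is flipped (e.g. "good" -> "bad").
import numpy as np

def polarity_sensitivity(encode, pairs):
    """encode: maps a sentence (str) to a 1-D numpy array.
    pairs: (sentence, polarity-flipped sentence) tuples."""
    distances = []
    for original, flipped in pairs:
        a, b = encode(original), encode(flipped)
        cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        distances.append(1.0 - cosine)  # cosine distance per pair
    # Higher mean distance: the encoder reacts more to polarity flips.
    return float(np.mean(distances))

pairs = [("the film was good", "the film was bad"),
         ("a truly great plot", "a truly terrible plot")]
# score = polarity_sensitivity(model.encode, pairs)  # model is hypothetical
```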

Cited by 4 publications (6 citation statements; citing works published 2021–2023). References 14 publications.
“…When employing the metric, one should define an appropriate distance function δ and a composition function f_η parametrised by η. Andreas illustrates TRE's versatility by instantiating it in three scenarios: to investigate whether image representations are similar to composed image attributes, whether phrase embeddings are similar to the vector addition of their components, and whether generalisation accuracy in a reference game correlates positively with TRE. Bhathena et al. (2020) present two TRE-based methods for obtaining compositionality ratings for sentiment trees, referred to as tree impurity and weighted node switching, which express the difference between the sentiment label of the root and those of the other nodes in the tree. Zheng and Jiang (2022) ranked sentiment-analysis examples by the extent to which neural models must memorise them to predict their targets correctly.…”
Section: Related Work (mentioning; confidence: 99%)
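
For reference, the TRE objective from Andreas (2019) can be sketched as below; the notation is assumed from that paper: f is the model's representation function, d(x) the derivation of input x, f̂_η the learned compositional approximation, and δ a task-appropriate distance.

```latex
% Sketch of TRE (Andreas, 2019): fit the compositional approximation, then
% score each input by its reconstruction distance.
\eta^{*} = \arg\min_{\eta} \sum_{i} \delta\big(f(x_i),\, \hat{f}_{\eta}(d(x_i))\big),
\qquad
\mathrm{TRE}(x_i) = \delta\big(f(x_i),\, \hat{f}_{\eta^{*}}(d(x_i))\big)
```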
“…First, existing work derives a single embedding for the entire query. Representing longer text inputs is being actively researched in the community and remains an open problem [7,8,6]. As a result, specific details or nested subqueries may be omitted or poorly represented, getting lost in the embedding.…”
Section: Neural Models for Semantic Code Search (mentioning; confidence: 99%)
“…Despite impressive results on SCS, current neural approaches remain far from satisfactory on a wide range of natural-language queries, especially those with compositional language structure. Encoding longer text into a dense vector is an open problem for neural language models, as neural networks are not believed to extract systematic rules from data [7,8,6]. This not only (a) hurts performance but can (b) drastically reduce a model's value to users, because compositional queries such as "Check that directory does not exist before creating it" require multi-step reasoning over code.…”
Section: Introduction (mentioning; confidence: 99%)
“…Phrase and sentence composition has drawn frequent attention in analyses of neural models, often focusing on internal representations and downstream task behavior (Ettinger et al., 2018; Conneau et al., 2019; Nandakumar et al., 2019; Yu and Ettinger, 2020; Bhathena et al., 2020; Mu and Andreas, 2020; Andreas, 2019). Some work investigates compositionality via constructing linguistic (Keysers et al., 2019) and non-linguistic (Liška et al., 2018; Hupkes et al., 2018; Baan et al., 2019) and Ettinger (2020).…”
Section: Related Work (mentioning; confidence: 99%)