Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.39
Explaining Contextualization in Language Models using Visual Analytics

Abstract: Despite the success of contextualized language models on various NLP tasks, it is still unclear what these models really learn. In this paper, we contribute to the current efforts of explaining such models by exploring the continuum between function and content words with respect to contextualization in BERT, based on linguistically-informed insights. In particular, we utilize scoring and visual analytics techniques: we use an existing similarity-based score to measure contextualization and integrate it into a…
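The similarity-based score mentioned in the abstract is not spelled out in this snippet. As a rough illustration of how a similarity-based contextualization measure can be computed, the sketch below averages the pairwise cosine similarity of one token's embeddings across different sentence contexts; the function name and the random example data are illustrative, not taken from the paper.

import numpy as np

def self_similarity(token_vectors: np.ndarray) -> float:
    """Average pairwise cosine similarity of one token's embeddings across
    different contexts; values near 1 mean the embedding barely changes
    (weak contextualization), lower values mean stronger contextualization.
    token_vectors has shape (n_contexts, hidden_dim)."""
    normed = token_vectors / np.linalg.norm(token_vectors, axis=1, keepdims=True)
    sims = normed @ normed.T                          # cosine similarity matrix
    n = token_vectors.shape[0]
    return float(sims[~np.eye(n, dtype=bool)].mean()) # drop self-comparisons

# Example with random vectors standing in for per-context BERT outputs:
vecs = np.random.rand(5, 768)
print(f"self-similarity: {self_similarity(vecs):.3f}")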

Cited by 9 publications (11 citation statements)
References 42 publications
“…[43] More specifically, this means that many words will need to have several different embeddings so that a context-dependent choice can be made for each situation.[44] To achieve this, the use of deep learning models is a popular choice, and approaches have, for example, been developed for recursive neural networks,[45] convolutional neural networks,[46] and recurrent neural networks.[47] Arguably, the current state-of-the-art technology for text embedding is the Universal Sentence Encoder (USE),[48] but the previously mentioned BERT algorithm also works for text of sentence length.…”
Section: Related Work
confidence: 99%
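To illustrate the point quoted above, that a contextualized model such as BERT assigns the same word a different embedding in each sentence, the sketch below extracts last-layer BERT vectors for one word in two contexts and compares them. It assumes the HuggingFace transformers and torch packages and the bert-base-uncased checkpoint; the helper word_embedding is a hypothetical name, not an API from the cited work.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_embedding(sentence: str, word: str) -> torch.Tensor:
    # Return the last-layer vector of `word` as it occurs in `sentence`
    # (assumes the word maps to a single subtoken, as "bank" does here).
    enc = tokenizer(sentence, return_tensors="pt")
    position = (enc["input_ids"][0] == tokenizer.convert_tokens_to_ids(word)).nonzero()[0].item()
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state       # (1, seq_len, hidden_dim)
    return hidden[0, position]

v1 = word_embedding("she sat on the river bank", "bank")
v2 = word_embedding("he deposited cash at the bank", "bank")
# The vectors differ because each token is conditioned on its context.
print(f"cosine similarity across contexts: {torch.cosine_similarity(v1, v2, dim=0).item():.3f}")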
“…[43] More specifically, this means that many words will need to have several different embeddings so that a context-dependent choice can be made for each situation.[44] To achieve this, the …”
[Figure 1 caption from the citing paper: Using the EEVO tool to visualize the performance of embedding-based ensembles conducting text similarity calculations on a large set of scientific publications (see further Section Visualization).]
Section: Word and Text Embedding
confidence: 99%
“…Most common explainability techniques either use supervised probing methods, i.e., linear classification models predicting specific linguistic properties (e.g., [Eth19]), or apply adversarial testing to conclude about models' capability of learning specific context properties (e.g., [MPL19]). However, the findings of these two strands of research are often contradictory [SKB∗21]. At the same time, visual analytics approaches are used for the explainability of embedding contextualization.…”
Section: Introduction
confidence: 99%
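A supervised probe of the kind mentioned in this statement can be sketched as a linear classifier trained on frozen embeddings to predict a linguistic property. The snippet below uses scikit-learn with random placeholder data standing in for real embeddings and part-of-speech tags; it is not the setup of any of the cited papers.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data: X holds frozen token embeddings from one model layer,
# y holds one linguistic label per token (e.g., a POS tag id).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 768))
y = rng.integers(0, 12, size=2000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Linear probe: if this simple classifier scores well, the property is
# linearly decodable from the frozen representations.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")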
“…A further, main contribution of this paper is an interactive explanation workspace that visualizes the computed score values. The visual representation of the scores is crucial due to the huge amount of data that is generated and has to be investigated, and because the embedding contextualization differs depending on the token's role (e.g., meaning or function) in its context [SKB∗21]. Visualizations are effective means for generating insights into such (complex) data patterns [KAF∗08].…”
Section: Introduction
confidence: 99%
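One simple way to present such per-token score values, loosely in the spirit of the workspace described above but in no way reproducing it, is a layer-by-token heatmap; the scores below are random placeholders.

import numpy as np
import matplotlib.pyplot as plt

tokens = ["the", "river", "bank", "was", "steep"]
n_layers = 12
scores = np.random.rand(n_layers, len(tokens))    # rows: layers, cols: tokens

fig, ax = plt.subplots(figsize=(6, 4))
im = ax.imshow(scores, aspect="auto", cmap="viridis")
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens)
ax.set_yticks(range(n_layers))
ax.set_yticklabels([f"layer {i + 1}" for i in range(n_layers)])
fig.colorbar(im, ax=ax, label="contextualization score")
plt.show()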
“…Contextualized LMs are usually pre-trained on a language modeling task (e.g., next word prediction) and are used as transfer-learning methods in other NLP tasks [64]. Adaptation to tasks is typically carried out through fine-tuning of the model, or part of it, on domain-specific data.…”
Section: From Text To Vectors
confidence: 99%
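A minimal sketch of the fine-tuning step described above, assuming HuggingFace transformers and torch; the toy in-memory texts, labels, and hyperparameters are placeholders, not taken from the cited work.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# A fresh classification head is added on top of the pre-trained encoder.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["the visualization is helpful", "the interface is confusing"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                          # a few gradient steps on domain data
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()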