Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)
DOI: 10.18653/v1/n19-1329

Understanding Learning Dynamics Of Language Models with SVCCA

Abstract: Research has shown that neural models implicitly encode linguistic features, but there has been no research showing how these encodings arise as the models are trained. We present the first study on the learning dynamics of neural language models, using a simple and flexible analysis method called Singular Vector Canonical Correlation Analysis (SVCCA), which enables us to compare learned representations across time and across models, without the need to evaluate directly on annotated data. We probe the evoluti…
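The comparison the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration of the SVCCA recipe (SVD on each set of activations to discard low-variance directions, then CCA on the reduced subspaces, with the mean canonical correlation as the similarity score), not the authors' implementation; the function name and the 99% variance threshold are assumptions:

```python
import numpy as np

def svcca_similarity(X, Y, var_kept=0.99):
    """Sketch of SVCCA similarity between two (samples x neurons)
    activation matrices, e.g. from two training checkpoints."""
    def svd_reduce(A, var_kept):
        # center, then keep enough singular directions to explain
        # `var_kept` of the variance
        A = A - A.mean(axis=0)
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), var_kept) + 1
        return U[:, :k] * s[:k]

    Xr = svd_reduce(X, var_kept)
    Yr = svd_reduce(Y, var_kept)
    # CCA via QR: canonical correlations are the singular values
    # of Qx^T Qy for orthonormal bases of the two column spaces
    Qx, _ = np.linalg.qr(Xr - Xr.mean(axis=0))
    Qy, _ = np.linalg.qr(Yr - Yr.mean(axis=0))
    rho = np.clip(np.linalg.svd(Qx.T @ Qy, compute_uv=False), 0.0, 1.0)
    return rho.mean()
```

Because the score depends only on the activation matrices, it can be computed between any two snapshots of a model during training, or between two different models run on the same inputs, with no annotated data involved.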


Cited by 71 publications (83 citation statements)
References 22 publications
“…The average of {ρ1, ..., ρm} is often used as an overall similarity measure, as in related work exploring multilingual representations in neural machine translation systems (Kudugunta et al., 2019) and language models (Saphra and Lopez, 2018). Morcos et al. (2018) show that, in studying recurrent and convolutional networks, replacing this simple average with a weighted average leads to a more robust measure of similarity between two sets of activations.…”
Section: CCA
confidence: 99%
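The two aggregation choices contrasted in the excerpt above can be sketched side by side: the plain mean of the canonical correlations, and a weighted mean in the spirit of Morcos et al. (2018), where each correlation is weighted by how much of the first representation its canonical direction accounts for. The weighting scheme here is a simplified NumPy illustration, not the reference implementation:

```python
import numpy as np

def cca_core(X, Y):
    """Canonical correlations and X-side canonical variates for two
    (samples x neurons) activation matrices."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, rho, _ = np.linalg.svd(Qx.T @ Qy)
    return np.clip(rho, 0.0, 1.0), Qx @ U, Xc

def mean_cca(X, Y):
    # unweighted average of {rho_1, ..., rho_m}
    rho, _, _ = cca_core(X, Y)
    return rho.mean()

def weighted_cca(X, Y):
    # weight each rho_i by the total (absolute) projection of the
    # neurons of X onto its canonical variate, then renormalize
    rho, H, Xc = cca_core(X, Y)
    w = np.abs(H.T @ Xc).sum(axis=1)[: len(rho)]
    w = w / w.sum()
    return (w * rho).sum()
```

Directions that explain little of the original activations then contribute little to the score, which is the intuition behind the robustness claim quoted above.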
“…Selecting probing tasks that might allow us to better interpret cross-lingual modelling is another logical path one might follow. On a similar theme, an interesting research direction also involves adapting simple probing tasks describing linguistic phenomena to specialised architectures, for better comparison using SVCCA-style analyses (Saphra and Lopez, 2018).…”
Section: Discussion
confidence: 99%
“…A.1.2 SVCCA Singular Vector Canonical Correlation Analysis (SVCCA) is a general method for comparing the correlation of two sets of vector representations. SVCCA has been proposed to compare learned representations across language models (Saphra and Lopez, 2018). Here we adopt SVCCA to measure the linear similarity of two sets of representations from the same multi-BERT model on different translated datasets, which are parallel to each other.…”
Section: A.1.1 Cosine Similarity
confidence: 99%