Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.65
Linguistic Profiling of a Neural Language Model

Abstract: In this paper we investigate the linguistic knowledge learned by a Neural Language Model (NLM) before and after a fine-tuning process and how this knowledge affects its predictions during several classification problems. We use a wide set of probing tasks, each of which corresponds to a distinct sentence-level feature extracted from different levels of linguistic annotation. We show that BERT is able to encode a wide range of linguistic characteristics, but it tends to lose this information when trained on spe…
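As a rough illustration of the probing setup described in the abstract, the sketch below fits a linear probe on frozen BERT sentence representations to predict a single sentence-level feature. The model checkpoint (bert-base-uncased), the mean-pooling strategy, the toy sentences, and the token-count feature are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal probing sketch, assuming Hugging Face transformers and scikit-learn.
# The feature probed here (token count) is a stand-in for the paper's
# sentence-level linguistic features.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def sentence_embedding(sentence: str, layer: int = -1) -> np.ndarray:
    """Mean-pool the hidden states of one layer into a sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states[layer]  # (1, seq_len, 768)
    return hidden_states.mean(dim=1).squeeze(0).numpy()

# Toy probing data: each sentence is paired with one feature value.
sentences = [
    "The cat sleeps.",
    "A very old man slowly crossed the narrow street.",
    "Dogs bark.",
    "The committee that the senator appointed reviewed the proposal carefully.",
]
feature_values = [len(s.split()) for s in sentences]

X = np.stack([sentence_embedding(s) for s in sentences])
y = np.array(feature_values, dtype=float)

# A linear probe: if the regressor predicts the feature well from frozen
# representations, the feature is (linearly) encoded in that layer.
probe = LinearRegression()
scores = cross_val_score(probe, X, y, cv=2, scoring="r2")
print("Probe R^2 per fold:", scores)
```

A higher probe score on a given layer is read as evidence that the feature is linearly recoverable from that layer's representations; the paper's own probes cover a much wider feature set drawn from multiple levels of linguistic annotation.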

Cited by 34 publications (33 citation statements)
References: 50 publications
“…Overall, our results show that information about the complexity of a sentence is better encoded in its explicit linguistic features, thus its syntactic and morphosyntactic structures. On the other hand, although BERT has been proven to embed a wide range of linguistic properties, including syntactic ones (Tenney et al., 2019; Miaschi et al., 2020), our findings seem to suggest that this model does not exploit these kind of features to solve a downstream task like ours, for which few data are available. Indeed, it has been shown that BERT performs better on datasets larger than ours (Kumar et al., 2020).…”
Section: Predicting Sentence Complexity
confidence: 64%
“…For example, Merchant et al. (2020) found that fine-tuning does not impact heavily the linguistic information implicitly learned by the model, especially when considering a supervised probe closely related to a downstream task. Miaschi et al. (2020) further demonstrated a positive correlation between the model's ability to solve a downstream task on a specific input sentence and the related linguistic knowledge encoded in a language model. Nonetheless, to our knowledge, no previous work has taken into account sentence complexity assessment as a fine-tuning task for NLMs.…”
Section: Discussion
confidence: 90%
“…All these features have been shown to play a highly predictive role when leveraged by traditional learning models on a variety of classification problems, also including the development of probes as reported by Miaschi et al. (2020), who showed that these features can be effectively used to profile the knowledge encoded in the language representations of a pretrained NLM.…”
Section: Linguistic Features
confidence: 96%
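To make the role of the explicit features mentioned in the excerpt above more concrete, here is a minimal sketch of a traditional learning model (a linear SVM from scikit-learn) trained on a few hand-crafted sentence-level features; the specific features, sentences, and labels are hypothetical stand-ins for the profiling features used in the cited work.

```python
# Minimal sketch: a traditional learner over explicit sentence-level features,
# assuming scikit-learn; the features and labels below are illustrative only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def explicit_features(sentence: str) -> list[float]:
    """Compute a few simple sentence-level features (stand-ins for
    morphosyntactic/syntactic profiling features)."""
    tokens = sentence.split()
    n_tokens = len(tokens)
    avg_token_len = sum(len(t) for t in tokens) / max(n_tokens, 1)
    n_commas = sentence.count(",")
    return [n_tokens, avg_token_len, n_commas]

# Toy binary task (e.g., "simple" = 0 vs. "complex" = 1 sentences).
sentences = [
    "The cat sleeps.",
    "Dogs bark.",
    "The committee that the senator appointed reviewed the proposal, carefully and at length.",
    "Although it was raining heavily, the children, who had planned the trip for weeks, went hiking.",
]
labels = np.array([0, 0, 1, 1])

X = np.array([explicit_features(s) for s in sentences])
clf = make_pipeline(StandardScaler(), LinearSVC())
clf.fit(X, labels)
print(clf.predict(X))
```

Swapping the hand-crafted features for pooled NLM representations, as in the earlier sketch, turns the same pipeline into a probe of what the pretrained model encodes.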