Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.504

Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models

Cited by 14 publications (10 citation statements)
References 0 publications
“…47 We also note that Huang et al. have used atomic energies to train NN-based atomistic models. 69 In a wider perspective, the pre-training of NN models is a well-documented approach in the ML literature for various applications and domains, [70][71][72][73][74] and it has very recently been described in the context of interatomic potential models, 47,75,76 property prediction with synthetic pre-training data, 77 and as a means to learn general-purpose representations for atomistic structure. 76…”
Section: Digital Discovery Accepted Manuscript (mentioning)
confidence: 99%
“…The recent success of transfer learning shows that pre-training (or continued pre-training) on similar source tasks can help better solve the downstream target task (e.g., question answering (Khashabi et al., 2020; Liu et al., 2021b), face verification (Cao et al., 2013), and general NLU tasks (Pruksachatkun et al., 2020)). Some previous work in cross-lingual transfer learning empirically observed that the model can transfer some knowledge beyond vocabulary (Artetxe et al., 2020; Ri & Tsuruoka, 2022), but it did not consider excluding the influence of other potential factors. Our results can serve as stronger evidence for the reason behind the success of transfer learning: in addition to transferring some surface patterns, better target performance can also benefit from similar abstract concepts learned from source tasks.…”
Section: A Discussion (mentioning)
confidence: 99%
“…This makes understanding what the model can do much easier. There is a growing number of studies that have made use of artificial language corpora to understand the representations learned by complex models (Asr & Jones, 2017; Elman, 1990, 1991, 1993; Frank et al., 2009; Mao et al., 2022; Perruchet & Vinter, 1998; Ravfogel et al., 2019; Ri & Tsuruoka, 2022; Rohde & Plaut, 1999; Rubin et al., 2014; St. Clair et al., 2009; Tabullo et al., 2012; Wang & Eisner, 2016; White & Cotterell, 2021; Willits, 2013).…”
Section: A World For Words (mentioning)
confidence: 99%