Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022)
DOI: 10.18653/v1/2022.naacl-main.341

Symbolic Knowledge Distillation: from General Language Models to Commonsense Models

Abstract: The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. Our study leads to a new framework, Symbolic Knowledge Distillation. As with prior art in Knowledge Distillation (Hinton et al., 2015), our approach uses larger models …
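The from-machine-to-corpus-to-machine loop the abstract describes can be made concrete with a minimal sketch: few-shot prompt a general language model to author commonsense triples, keep the parseable ones as a silver corpus, and use that corpus to train a student. Everything below is an assumption for illustration, not the paper's setup: GPT-2 stands in for GPT-3, the prompt format is a toy ATOMIC-style template, and the paper's trained critic model is replaced by a crude non-empty filter.

```python
# Minimal sketch of symbolic knowledge distillation's first step:
# a general LM authors (event, relation, inference) triples.
# GPT-2 is a stand-in for GPT-3; prompt and parsing are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Few-shot prompt with ATOMIC-style exemplars (hypothetical format).
FEW_SHOT = (
    "Event: X pays Y a compliment. xEffect: X smiles.\n"
    "Event: X drinks coffee in the morning. xEffect: X feels awake.\n"
    "Event: X loses their wallet. xEffect:"
)

def author_triples(prompt: str, n: int = 5) -> list[tuple[str, str, str]]:
    """Sample continuations, parse each into an (event, relation, tail) triple."""
    outputs = generator(
        prompt,
        max_new_tokens=12,
        num_return_sequences=n,
        do_sample=True,
        pad_token_id=generator.tokenizer.eos_token_id,
    )
    triples = []
    for out in outputs:
        # Keep only the newly generated text, up to the first newline.
        tail = out["generated_text"][len(prompt):].split("\n")[0].strip()
        if tail:  # crude filter; the paper uses a trained critic instead
            triples.append(("X loses their wallet.", "xEffect", tail))
    return triples

print(author_triples(FEW_SHOT))
```

The harvested triples would then serve as training data for a smaller commonsense student model; the quality filtering step (the critic) is what the paper argues makes the machine-authored corpus competitive with human-authored graphs.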

Cited by 74 publications (62 citation statements)
References 41 publications
“…Recent work augmented datasets by fine-tuning a pre-trained LM on real data, then generated new, silver-labelled instances (Anaby-Tavor et al., 2020; Papanikolaou and Pierleoni, 2020; Kumar et al., 2020). Similarly, the few-shot capabilities of GPT-3 (Brown et al., 2020) were leveraged to generate free-text explanations (Wiegreffe et al., 2022), semantically-related sentence pairs (Schick and Schütze, 2021), atomic event commonsense triples (West et al., 2022), and labels for various generation and understanding tasks. In this work, we fine-tune GPT-3 with minimal human supervision to generate additional contextual data pertaining to events.…”
Section: LM-Generated Data Augmentation (mentioning)
confidence: 99%
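The statement above names two augmentation recipes: prompt a frozen few-shot model directly (as sketched earlier), or fine-tune a pre-trained LM on real data and then sample new, silver-labelled instances from it. A compressed sketch of the second recipe, with every specific an assumption: GPT-2 instead of GPT-3, a toy "input => label" serialization, and throwaway hyperparameters.

```python
# Sketch: fine-tune a causal LM on gold-labelled examples serialized as
# text, then sample from it to mint new, silver-labelled instances.
# Model, serialization format, and hyperparameters are illustrative.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import Dataset

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Real, gold-labelled data serialized as "input => label" strings.
gold = Dataset.from_dict({"text": [
    "the plot drags badly => negative",
    "a warm, funny film => positive",
]})
gold = gold.map(lambda batch: tok(batch["text"], truncation=True),
                batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="aug_lm", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to=[]),
    train_dataset=gold,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()

# Sample a new instance; the generated "=> label" suffix is what makes
# it silver-labelled rather than gold.
prompt = tok("a ", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=20, do_sample=True,
                     pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
```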
“…Recent approaches have shown great potential to incorporate external knowledge for knowledge-based VQA. Several methods explore aggregating external knowledge in the form of structured knowledge graphs (Garderes et al., 2020; Narasimhan et al., 2018; Li et al., 2020b; Wang et al., 2017a,b), unstructured knowledge bases (Marino et al., 2021; Wu et al., 2022; Luo et al., 2021), or neural-symbolic inference-based knowledge (Chen et al., 2020; West et al., 2021). In these methods, object detectors (Ren et al., 2015) and scene classifiers (He et al., 2016) are used to associate images with external knowledge.…”
Section: Related Work (mentioning)
confidence: 99%
“…Commonsense knowledge acquisition is a longstanding challenge in natural language processing (Charniak, 1973; Hwang et al., 2021; Zhang et al., 2021), and current approaches rely on knowledge acquired by pre-trained Transformer language models (Bosselut et al., 2019; Zhang et al., 2020; West et al., 2021). The commonsense reasoning ability of these language models has been evaluated using behavioral probes (Ettinger, 2020; Misra et al., 2021; He et al., 2021) and downstream, fine-tuned evaluations (Banerjee et al., 2021; Zhou et al., 2021).…”
Section: Related Work (mentioning)
confidence: 99%