2022
DOI: 10.48550/arxiv.2210.01504
Preprint

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Abstract: Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply applying the unlikelihood training […]
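To make the abstract's framing concrete, below is a minimal sketch (an assumption about the setup, not the authors' released code) of a post-hoc unlearning step: the token-level likelihood of a target sequence is negated so that an ordinary optimizer step becomes gradient ascent on the text to be forgotten. The model name, learning rate, and `forget_texts` are illustrative placeholders.

```python
# Hedged sketch of knowledge unlearning on target sequences:
# maximize the LM loss on text to be forgotten by negating it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper works with larger pretrained LMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optim = torch.optim.AdamW(model.parameters(), lr=5e-5)

forget_texts = ["example private sequence to be forgotten"]  # hypothetical data

model.train()
for text in forget_texts:
    batch = tok(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    # out.loss is the standard causal-LM cross-entropy; negating it turns a
    # gradient-descent step into gradient ascent on the target sequence.
    loss = -out.loss
    loss.backward()
    optim.step()
    optim.zero_grad()
```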

Cited by 1 publication (14 citation statements)
References 26 publications
“…To mitigate the costs associated with retraining LLMs for each forget request, approximate unlearning methods have been proposed (Pawelczyk et al., 2023; Chen & Yang, 2023; Meng et al., 2022; Jang et al., 2022; Kassem et al., 2023), which remove the contribution of the target data points from the pretrained model.…”
Section: Machine Unlearning for Memorized Data
confidence: 99%
“…The current SOTA approach for our task at hand (also regarded as SOTA in Liu & Kalinli, 2023; Kassem et al., 2023; Pawelczyk et al., 2023) is the work of Jang et al. (2022), who use gradient ascent (GA), oblivious to the pretraining dataset, to minimize exact memorization over a set of textual sequences. It shows that simply maximizing the loss with respect to the textual sequences to be forgotten produces a competitive trade-off between privacy and model utility.…”
Section: Related Work
confidence: 99%
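Since this citation statement describes gradient ascent as minimizing exact memorization, here is a minimal sketch (an assumption, not the cited papers' code) of one way to check memorization: teacher-forced greedy next-token accuracy on a target sequence. The function name and interface are illustrative.

```python
# Hedged sketch of a memorization check: how often does greedy decoding,
# given the true prefix, reproduce the next token of the target sequence?
import torch

@torch.no_grad()
def memorization_accuracy(model, tokenizer, text: str) -> float:
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    logits = model(input_ids=ids).logits    # shape (1, seq_len, vocab)
    preds = logits[0, :-1].argmax(dim=-1)   # prediction for token t+1 at position t
    targets = ids[0, 1:]                    # the actual next tokens
    return (preds == targets).float().mean().item()
```

Running such a check before and after the unlearning step sketched above would indicate whether the target sequence is still reproduced verbatim (accuracy near 1.0 suggests strong memorization).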