2022
DOI: 10.48550/arxiv.2202.13585
Preprint

Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten

Quoc Phong Nguyen,
Ryutaro Oikawa,
Dinil Mon Divakaran
et al.

Abstract: As the use of machine learning (ML) models is becoming increasingly popular in many real-world applications, there are practical challenges that need to be addressed for model maintenance. One such challenge is to 'undo' the effect of a specific subset of the dataset used for training a model. This specific subset may contain malicious or adversarial data injected by an attacker, which affects the model performance. Another reason may be the need for a service provider to remove data pertaining to a specific user …

Cited by 3 publications (3 citation statements)
References 18 publications
“…Brophy [23] discussed a random forest algorithm to support model retraining with a subtree. Nguyen [24] presented a Markov chain Monte Carlo algorithm for extracting data samples to estimate the posterior belief of model parameters. However, these solutions have high resource consumption due to the malicious data distribution and historical training information, making them difficult to use in a generic case.…”
Section: Related Work (mentioning)
Confidence: 99%
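The MCMC-based approach cited above (Nguyen [24]) estimates the posterior belief of the model parameters given only the retained data. A toy sketch of the general idea — our own illustration under simplifying assumptions, not the authors' algorithm — is to run a Metropolis sampler over a 1-D Gaussian mean, then "unlearn" a poisoned subset by re-targeting the chain at the posterior of the clean data, warm-started from the full-data chain rather than retrained from scratch:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta, data, prior_var=10.0, noise_var=1.0):
    # Gaussian prior N(0, prior_var) and Gaussian likelihood N(theta, noise_var)
    lp = -theta ** 2 / (2 * prior_var)
    lp += -np.sum((data - theta) ** 2) / (2 * noise_var)
    return lp

def metropolis(data, theta0, n_steps=5000, step=0.3):
    # Random-walk Metropolis targeting the posterior over theta
    theta, samples = theta0, []
    lp = log_post(theta, data)
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal()
        lp_prop = log_post(prop, data)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta)
    return np.array(samples)

# Full dataset: clean points plus a "to be forgotten" poisoned subset
clean = rng.normal(0.0, 1.0, size=100)
forget = rng.normal(5.0, 1.0, size=20)   # e.g. adversarially injected data
full = np.concatenate([clean, forget])

full_samples = metropolis(full, theta0=0.0)
# Unlearning step: target the retained-data posterior only, warm-started
# from the end of the full-data chain instead of restarting from scratch.
unlearned = metropolis(clean, theta0=full_samples[-1], n_steps=2000)

print(full_samples[1000:].mean())  # pulled away from 0 by the poisoned subset
print(unlearned[500:].mean())      # moves back toward the clean-data posterior
```

The warm start is the point of the sketch: the full-data posterior samples are a good initialization for the retained-data chain, so far fewer steps are needed than retraining would take. The actual paper uses a more careful sample-reuse scheme; this only illustrates the posterior-re-estimation view of unlearning.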
“…Once the data of certain patients is confirmed to be used to train the target DL model by auditing, forgetting requires the removal of learnt information of certain patients' data from the target DL model, which is also called machine unlearning, while auditing could act as the verification of machine unlearning [18]. In order to achieve forgetting, existing unlearning methods could be classified into three major classes, including model-agnostic methods, model-intrinsic methods and data-driven methods [20]. Model-agnostic methods refer to algorithms or frameworks that can be used for different DL models, including differential privacy [18], [21], [22], certified removal [23], [24], [25], statistical query learning [6], decremental learning [26], knowledge adaptation [27], [28] and parameter sampling [29]. Model-intrinsic approaches are those methods designed for specific types of models, such as for softmax classifiers [30], linear models [31], tree-based models [32] and Bayesian models [19].…”
Section: Introduction (mentioning)
Confidence: 99%
“…Several approaches were presented, which apply to different models and come with different assumptions and restrictions. So far, these approaches cover unlearning in decision trees and random forests [3,22], linear models such as logistic regression [1,8,10], neural networks [2,7,8,9,10] and even Markov Chain Monte Carlo [6,19]. While most approaches focus on forgetting in a single model, there also exist works that deal with federated models instead [29].…”
Section: Introduction and Related Work (mentioning)
Confidence: 99%