2021
DOI: 10.1109/access.2021.3090019

Coded Machine Unlearning

Abstract: Models trained in machine learning processes may store information about individual samples used in the training process. There are many cases where the impact of an individual sample may need to be deleted and unlearned (i.e., removed) from the model. Retraining the model from scratch after removing a sample from its training set guarantees perfect unlearning; however, it becomes increasingly expensive as the size of the training dataset increases. One solution to this issue is utilizing an ensemble learning meth…
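The abstract's baseline, retraining from scratch on the dataset minus the deleted sample, can be made concrete with a minimal sketch (least-squares regression is used here purely as an illustrative model; the function and variable names are ours, not the paper's):

```python
import numpy as np

def fit_least_squares(X, y):
    """Fit w minimizing ||Xw - y||^2 via the pseudoinverse."""
    return np.linalg.pinv(X) @ y

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.1 * rng.normal(size=100)

w_full = fit_least_squares(X, y)

# "Perfect" unlearning of sample i: retrain on the dataset with row i removed.
i = 42
X_del, y_del = np.delete(X, i, axis=0), np.delete(y, i)
w_unlearned = fit_least_squares(X_del, y_del)
```

By construction the retrained model is identical to one that never saw sample `i`, but the cost of each such deletion grows with the full dataset size, which is exactly the expense the paper's ensemble-based approach targets.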


Cited by 16 publications (17 citation statements)
References 15 publications
“…This ensures architecturally [6,11,53] or temporally [11,35] isolating the influence of any sample to a limited part of training, requiring retraining for only the affected parts. Isolation has been used across techniques like Linear Classification [6], Random Forest [13,54], KNN [1], SVM [16,63] and DNN [11,30,35] by utilizing or creating a sparse influence graph [53]. Data-influence isolation often comes at the cost of utility as each portion becomes a weaker learner [8], especially in deep networks [58].…”
Section: Against Data-influence Isolation
confidence: 99%
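The isolation strategy described above can be sketched in a few lines: train one constituent model per disjoint shard, and on deletion retrain only the shard that held the sample. This is a toy illustration under our own assumptions (simple least-squares constituents, prediction averaging), not the exact construction of any of the cited papers:

```python
import numpy as np

class ShardedRegressor:
    """Toy isolation-based ensemble: each constituent least-squares model is
    trained on one disjoint shard, so deleting a sample requires retraining
    only the shard that contained it."""

    def __init__(self, X, y, n_shards=4):
        # Disjoint shards via strided slicing; copies so shards can be edited.
        self.shards = [
            (X[i::n_shards].copy(), y[i::n_shards].copy()) for i in range(n_shards)
        ]
        self.weights = [np.linalg.pinv(Xs) @ ys for Xs, ys in self.shards]

    def predict(self, X):
        # Aggregate the ensemble by averaging the constituents' predictions.
        return np.mean([X @ w for w in self.weights], axis=0)

    def unlearn(self, shard_idx, row_idx):
        # Remove one sample and retrain ONLY the affected shard.
        Xs, ys = self.shards[shard_idx]
        Xs, ys = np.delete(Xs, row_idx, axis=0), np.delete(ys, row_idx)
        self.shards[shard_idx] = (Xs, ys)
        self.weights[shard_idx] = np.linalg.pinv(Xs) @ ys
```

Unlearning cost now scales with the shard size rather than the dataset size, while the quoted utility caveat is also visible here: each constituent sees only a fraction of the data and is a correspondingly weaker learner.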
“…Isolation-based strategies change the training process by creating an ensemble [11,30,35,53], each of whose models is trained on different subsets of the dataset. This ensures architecturally [6,11,53] or temporally [11,35] isolating the influence of any sample to a limited part of training, requiring retraining for only the affected parts. Isolation has been used across techniques like Linear Classification [6], Random Forest [13,54], KNN [1], SVM [16,63] and DNN [11,30,35] by utilizing or creating a sparse influence graph [53].…”
Section: Against Data-influence Isolation
confidence: 99%
“…In this setting, the unlearning of data points can be efficiently carried out by only retraining the affected submodels. Aldaghri et al [3] show that this approach can be further sped up for least-squares regression by choosing the shards cleverly. Unlearning based on shards, however, is suitable for removing a few data points only and inevitably deteriorates in performance when larger portions of the data require changes.…”
Section: Related Work
confidence: 99%
“…Several approaches were presented, which apply to different models and come with different assumptions and restrictions. So far, these approaches cover unlearning in decision trees and random forests [3,22], linear models such as logistic regression [1,8,10], neural networks [2,7,8,9,10] and even Markov Chain Monte Carlo [6,19]. While most approaches focus on forgetting in a single model, there also exist works that deal with federated models instead [29].…”
Section: Introduction and Related Work
confidence: 99%