2021
DOI: 10.1109/access.2021.3090019

Coded Machine Unlearning

Abstract: Models trained in machine learning processes may store information about individual samples used in the training process. There are many cases where the impact of an individual sample may need to be deleted and unlearned (i.e., removed) from the model. Retraining the model from scratch after removing a sample from its training set guarantees perfect unlearning; however, it becomes increasingly expensive as the size of the training dataset increases. One solution to this issue is utilizing an ensemble learning meth…
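The abstract's baseline, retraining from scratch on the dataset minus the deleted sample, can be made concrete with a minimal sketch (least-squares regression is used here purely as an illustrative model; the function and variable names are ours, not the paper's):

```python
import numpy as np

def fit_least_squares(X, y):
    """Fit w minimizing ||Xw - y||^2 via the pseudoinverse."""
    return np.linalg.pinv(X) @ y

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.1 * rng.normal(size=100)

w_full = fit_least_squares(X, y)

# "Perfect" unlearning of sample i: retrain on the dataset with row i removed.
i = 42
X_del, y_del = np.delete(X, i, axis=0), np.delete(y, i)
w_unlearned = fit_least_squares(X_del, y_del)
```

By construction the retrained model is identical to one that never saw sample `i`, but the cost of each such deletion grows with the full dataset size, which is exactly the expense the paper's ensemble-based approach targets.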


Cited by 16 publications (17 citation statements)
References 15 publications
“…This ensures architecturally [6,11,53] or temporally [11,35] isolating the influence of any sample to a limited part of training, requiring retraining for only the affected parts. Isolation has been used across techniques like Linear Classification [6], Random Forest [13,54], KNN [1], SVM [16,63] and DNN [11,30,35] by utilizing or creating a sparse influence graph [53]. Data-influence isolation often comes at the cost of utility as each portion becomes a weaker learner [8], especially in deep networks [58].…”
Section: Against Data-influence Isolation
confidence: 99%
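The isolation strategy described above can be sketched in a few lines: train one constituent model per disjoint shard, and on deletion retrain only the shard that held the sample. This is a toy illustration under our own assumptions (simple least-squares constituents, prediction averaging), not the exact construction of any of the cited papers:

```python
import numpy as np

class ShardedRegressor:
    """Toy isolation-based ensemble: each constituent least-squares model is
    trained on one disjoint shard, so deleting a sample requires retraining
    only the shard that contained it."""

    def __init__(self, X, y, n_shards=4):
        # Disjoint shards via strided slicing; copies so shards can be edited.
        self.shards = [
            (X[i::n_shards].copy(), y[i::n_shards].copy()) for i in range(n_shards)
        ]
        self.weights = [np.linalg.pinv(Xs) @ ys for Xs, ys in self.shards]

    def predict(self, X):
        # Aggregate the ensemble by averaging the constituents' predictions.
        return np.mean([X @ w for w in self.weights], axis=0)

    def unlearn(self, shard_idx, row_idx):
        # Remove one sample and retrain ONLY the affected shard.
        Xs, ys = self.shards[shard_idx]
        Xs, ys = np.delete(Xs, row_idx, axis=0), np.delete(ys, row_idx)
        self.shards[shard_idx] = (Xs, ys)
        self.weights[shard_idx] = np.linalg.pinv(Xs) @ ys
```

Unlearning cost now scales with the shard size rather than the dataset size, while the quoted utility caveat is also visible here: each constituent sees only a fraction of the data and is a correspondingly weaker learner.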
“…Isolation-based strategies change the training process by creating an ensemble [11,30,35,53], each of whose models is trained on different subsets of the dataset. This ensures architecturally [6,11,53] or temporally [11,35] isolating the influence of any sample to a limited part of training, requiring retraining for only the affected parts. Isolation has been used across techniques like Linear Classification [6], Random Forest [13,54], KNN [1], SVM [16,63] and DNN [11,30,35] by utilizing or creating a sparse influence graph [53].…”
Section: Against Data-influence Isolation
confidence: 99%
“…In this setting, the unlearning of data points can be efficiently carried out by only retraining the affected submodels. Aldaghri et al [3] show that this approach can be further sped up for least-squares regression by choosing the shards cleverly. Unlearning based on shards, however, is suitable for removing a few data points only and inevitably deteriorates in performance when larger portions of the data require changes.…”
Section: Related Work
confidence: 99%
“…Several approaches were presented, which apply to different models and come with different assumptions and restrictions. So far, these approaches cover unlearning in decision trees and random forests [3,22], linear models such as logistic regression [1,8,10], neural networks [2,7,8,9,10] and even Markov Chain Monte Carlo [6,19]. While most approaches focus on forgetting in a single model, there also exist works that deal with federated models instead [29].…”
Section: Introduction and Related Work
confidence: 99%