Machine Unlearning of Features and Labels

Warnecke, Alexander; Pirch, Lukas; Wressnegger, Christian; Rieck, Konrad

doi:10.48550/arxiv.2108.11577

Cited by 5 publications

(5 citation statements)

References 23 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Let p(n) be the probability that all P portions are affected on the removal of n samples. Extending the analysis of Warnecke et al [67] from the specific case of SISA to data-influence isolation in general, we get: is a fast increase in the chance of needing a full-retrain as deletion sets get larger. This demonstrates how datainfluence isolation provides little improvement in efficiency compared to the retrain-from-scratch baseline for practical scenarios.…”

Section: D1 Against Isolation Strategiesmentioning

confidence: 88%

“…Inexact-unlearning: The primary goal of forgetting can often be hard to achieve, especially in deep networks. Hence, inexact-unlearning literature has relaxed the forgetting goal in two ways: not provable [10,32,35,39,45,56,67] and imperfect [10,31,32,35,39,51,65,68]. Relaxing provability implies unlearning methods do not provide any proven guarantees of information removal.…”

Section: Problem Formulationmentioning

confidence: 99%

See 1 more Smart Citation

Towards Adversarial Evaluations for Inexact Machine Unlearning

Goel¹,

Prabhu²,

Kumaraguru³

2022

Preprint

View full text Add to dashboard Cite

Existing works in inexact machine unlearning focus on achieving indistinguishability from models retrained after removing the deletion set. We argue that indistinguishability is unnecessary, infeasible to measure, and its practical relaxations can be insufficient. We redefine the goal of unlearning as forgetting all information specific to the deletion set while maintaining high utility and resource efficiency.Motivated by the practical application of removing mislabelled and biased data from models, we introduce a novel test to measure the degree of forgetting called Interclass Confusion (IC). It allows us to analyze two aspects of forgetting: (i) memorization and (ii) property generalization. Despite being a black-box test, IC can investigate whether information from the deletion set was erased until the early layers of the network. We empirically show that two simple unlearning methods, exact-unlearning and catastrophic-forgetting the final k layers of a network, scale well to large deletion sets unlike prior unlearning methods. k controls the forgettingefficiency tradeoff at similar utility. Overall, we believe our formulation of unlearning and the IC test will guide the design of better unlearning algorithms.* Equal Contribution 1 to the extent that bias is a dataset problem [37].

show abstract

Section: D1 Against Isolation Strategiesmentioning

confidence: 88%

Section: Problem Formulationmentioning

confidence: 99%

Towards Adversarial Evaluations for Inexact Machine Unlearning

Goel¹,

Prabhu²,

Kumaraguru³

2022

Preprint

View full text Add to dashboard Cite

show abstract

“…At lower levels, split variables are selected to greedily maximize splitting conditions such as the Gini index or mutual information. Other methods, including [8,[40][41][42], require parameters to be stored during training, the dwelling of note as their other more salient details we will detail subsequently in other parts.…”

Section: ) In-processing Mumentioning

confidence: 99%

Ensuring User Privacy and Model Security via Machine Unlearning: A Review

Tang,

Cai,

Liu

et al. 2023

CMC

View full text Add to dashboard Cite

As an emerging discipline, machine learning has been widely used in artificial intelligence, education, meteorology and other fields. In the training of machine learning models, trainers need to use a large amount of practical data, which inevitably involves user privacy. Besides, by polluting the training data, a malicious adversary can poison the model, thus compromising model security. The data provider hopes that the model trainer can prove to them the confidentiality of the model. Trainer will be required to withdraw data when the trust collapses. In the meantime, trainers hope to forget the injected data to regain security when finding crafted poisoned data after the model training. Therefore, we focus on forgetting systems, the process of which we call machine unlearning, capable of forgetting specific data entirely and efficiently. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing machine unlearning methods based on their characteristics and analyze the relation between machine unlearning and relevant fields (e.g., inference attacks and data poisoning attacks). Finally, we briefly conclude the existing research directions.

show abstract

“…Sekari et al [25] studied the difference between differential privacy and machine unlearning. Warnecke et al [26] scaled to the problem of forgetting a group of features and labels from a model. Unlearning for Bayesian methods [27], k-means clustering systems [10], and random forests [8] have also been explored.…”

Section: Related Workmentioning

confidence: 99%

Deep Regression Unlearning

Tarun¹,

Chundawat²,

Mandal³

et al. 2022

Preprint

View full text Add to dashboard Cite

With the introduction of data protection and privacy regulations, it has become crucial to remove the lineage of data on demand in a machine learning system. In past few years, there has been notable development in machine unlearning to remove the information of certain training data points efficiently and effectively from the model. In this work, we explore unlearning in a regression problem, particularly in deep learning models. Unlearning in classification and simple linear regression has been investigated considerably. However, unlearning in deep regression models largely remain an untouched problem till now. In this work, we introduce deep regression unlearning methods that are well generalized and robust to privacy attacks. We propose the Blindspot unlearning method which uses a novel weight optimization process. A randomly initialized model, partially exposed to the retain samples and a copy of original model are used together to selectively imprint knowledge about the data that we wish to keep and scrub the information of the data we wish to forget. We also propose a Gaussian distribution based fine tuning method for regression unlearning. The existing evaluation metrics for unlearning in a classification task are not directly applicable for regression unlearning. Therefore, we adapt these metrics for regression task. We devise a membership inference attack to check the privacy leaks in the unlearned regression model. We conduct the experiments on regression tasks for computer vision, natural language processing and forecasting applications. Our deep regression unlearning methods show excellent performance across all of these datasets and metrics.

show abstract

Machine Unlearning of Features and Labels

Cited by 5 publications

References 23 publications

Towards Adversarial Evaluations for Inexact Machine Unlearning

Towards Adversarial Evaluations for Inexact Machine Unlearning

Ensuring User Privacy and Model Security via Machine Unlearning: A Review

Deep Regression Unlearning

Contact Info

Product

Resources

About