2022
DOI: 10.48550/arxiv.2202.13585
Preprint

Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten

Quoc Phong Nguyen,
Ryutaro Oikawa,
Dinil Mon Divakaran
et al.

Abstract: As the use of machine learning (ML) models is becoming increasingly popular in many real-world applications, there are practical challenges that need to be addressed for model maintenance. One such challenge is to 'undo' the effect of a specific subset of the dataset used for training a model. This specific subset may contain malicious or adversarial data injected by an attacker, which affects the model performance. Another reason may be the need for a service provider to remove data pertaining to a specific user …

Cited by 3 publications (3 citation statements)
References 18 publications
“…Brophy [23] discussed a random forest algorithm to support model retraining with a subtree. Nguyen [24] presented a Markov chain Monte Carlo algorithm for extracting data samples to estimate the posterior belief of model parameters. However, these solutions have high resource consumption due to the malicious data distribution and historical training information, making them difficult to use in a generic case.…”
Section: Related Work (mentioning)
Confidence: 99%
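The MCMC-based approach cited above (Nguyen [24]) estimates the posterior belief of the model parameters given only the retained data. A toy sketch of the general idea — our own illustration under simplifying assumptions, not the authors' algorithm — is to run a Metropolis sampler over a 1-D Gaussian mean, then "unlearn" a poisoned subset by re-targeting the chain at the posterior of the clean data, warm-started from the full-data chain rather than retrained from scratch:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta, data, prior_var=10.0, noise_var=1.0):
    # Gaussian prior N(0, prior_var) and Gaussian likelihood N(theta, noise_var)
    lp = -theta ** 2 / (2 * prior_var)
    lp += -np.sum((data - theta) ** 2) / (2 * noise_var)
    return lp

def metropolis(data, theta0, n_steps=5000, step=0.3):
    # Random-walk Metropolis targeting the posterior over theta
    theta, samples = theta0, []
    lp = log_post(theta, data)
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal()
        lp_prop = log_post(prop, data)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta)
    return np.array(samples)

# Full dataset: clean points plus a "to be forgotten" poisoned subset
clean = rng.normal(0.0, 1.0, size=100)
forget = rng.normal(5.0, 1.0, size=20)   # e.g. adversarially injected data
full = np.concatenate([clean, forget])

full_samples = metropolis(full, theta0=0.0)
# Unlearning step: target the retained-data posterior only, warm-started
# from the end of the full-data chain instead of restarting from scratch.
unlearned = metropolis(clean, theta0=full_samples[-1], n_steps=2000)

print(full_samples[1000:].mean())  # pulled away from 0 by the poisoned subset
print(unlearned[500:].mean())      # moves back toward the clean-data posterior
```

The warm start is the point of the sketch: the full-data posterior samples are a good initialization for the retained-data chain, so far fewer steps are needed than retraining would take. The actual paper uses a more careful sample-reuse scheme; this only illustrates the posterior-re-estimation view of unlearning.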
“…Once the data of certain patients is confirmed to be used to train the target DL model by auditing, forgetting requires the removal of learnt information of certain patients' data from the target DL model, which is also called machine unlearning, while auditing could act as the verification of machine unlearning [18]. In order to achieve forgetting, existing unlearning methods could be classified into three major classes, including model-agnostic methods, model-intrinsic methods and data-driven methods [20]. Model-agnostic methods refer to algorithms or frameworks that can be used for different DL models, including differential privacy [18], [21], [22], certified removal [23], [24], [25], statistical query learning [6], decremental learning [26], knowledge adaptation [27], [28] and parameter sampling [29]. Model-intrinsic approaches are those methods designed for specific types of models, such as for softmax classifiers [30], linear models [31], tree-based models [32] and Bayesian models [19].…”
Section: Introduction (mentioning)
Confidence: 99%
“…Several approaches were presented, which apply to different models and come with different assumptions and restrictions. So far, these approaches cover unlearning in decision trees and random forests [3,22], linear models such as logistic regression [1,8,10], neural networks [2,7,8,9,10] and even Markov Chain Monte Carlo [6,19]. While most approaches focus on forgetting in a single model, there also exist works that deal with federated models instead [29].…”
Section: Introduction and Related Work (mentioning)
Confidence: 99%