Private aggregation of teacher ensembles (PATE), a general machine learning framework based on knowledge distillation, can provide a privacy guarantee for training data sets. However, this framework poses a number of security risks. First, PATE mainly focuses
1 | INTRODUCTION

Machine learning (ML) has been widely used in computer vision, natural language processing, genomics, and other fields. In recent years, however, privacy leakage issues have adversely affected the practical application of ML in settings where highly sensitive data, such as medical record data,1 is used to train models. Recent studies have shown that ML algorithms are extremely vulnerable to malicious attacks. Because an ML model implicitly encodes information about its training data, an attacker can exploit this property by analyzing the model's parameters or outputs to derive part or all of the private information in the training data. For example, Tramer et al.2 deployed a model extraction attack against the online machine learning as a service (MLaaS) platforms of Google and Amazon, and they successfully obtained a model similar to the original one. Fredrikson et al.3 recovered the original training data by analyzing the class probabilities output by the model's classifier. The membership inference attack designed by Shokri et al.4 can determine whether a specific data sample is present in the model's training data set on the basis of the prediction result.

To address the privacy risks posed by the aforementioned attacks, Papernot et al.5,6 proposed a privacy-preserving framework called private aggregation of teacher ensembles (PATE) on the basis of multiteacher knowledge transfer. The core idea of PATE is as follows: if independent classifiers trained on disjoint data sets show a high degree of consistency for the same input, then the output will not leak any information about the training data. Therefore, PATE first partitions a private data set into multiple disjoint subsets and trains a set of teacher models on these subsets. The knowledge of the teacher models is then transferred to a student model through an aggregation mechanism that satisfies differential privacy; that is, all teacher models predict labels for the public data submitted by the student. In the end, only the student model is published, after being trained on public data with the privacy-preserving labels obtained from the teachers. An adversary can access only the student model, and thus the privacy of the teachers' training data is protected.

Although PATE provides a flexible approach to training models with privacy guarantees, it still suffers from several limitations and shortcomings.7 First, PATE assumes that the data submitted by the student is always public and contains nothing sensitive. That is to say, PATE cannot provide any privacy guarantee when the data provided by the student is private. When the teacher models make predictions on a student's private inputs, information about the data may be leaked directly t...