Recently, a new paradigm of building general-purpose language models (e.g., Google's BERT and OpenAI's GPT-2) has arisen in Natural Language Processing (NLP) for text feature extraction, a standard procedure that converts text to vectors (i.e., embeddings) for downstream modeling, and is starting to be applied in various downstream NLP tasks and real-world systems (e.g., Google's search engine [6]). To obtain general-purpose text embeddings, these language models have highly complicated architectures with millions of learnable parameters and are usually pretrained on billions of sentences before being utilized. As is widely recognized, such a practice indeed improves the state-of-the-art performance of many downstream NLP tasks.

However, the improved utility is not free. We find that the text embeddings produced by general-purpose language models capture much sensitive information from the plain text. Once accessed by an adversary, the embeddings can be reverse-engineered to disclose sensitive information about the victims for further harassment. Although such a privacy risk can pose a real threat to the future use of these promising NLP tools, there are, so far, neither published attacks nor systematic evaluations for mainstream industry-level language models.

To bridge this gap, we present the first systematic study of the privacy risks of 8 state-of-the-art language models with 4 diverse case studies. By constructing 2 novel attack classes, our study demonstrates that the aforementioned privacy risks do exist and can pose practical threats to the application of general-purpose language models on sensitive data covering identity, genome, healthcare, and location. For example, we show that an adversary with nearly no prior knowledge can achieve about 75% accuracy when inferring the precise disease site from BERT embeddings of patients' medical descriptions. As possible countermeasures, we propose 4 different defenses (via rounding, differential privacy, adversarial training, and subspace projection) to obfuscate the unprotected embeddings for mitigation purposes. With extensive evaluations, we also provide a preliminary analysis of the utility-privacy trade-off brought by each defense, which we hope may foster future mitigation research.
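As a rough illustration of the kind of embedding obfuscation the abstract refers to, below is a minimal Python sketch (not the paper's implementation) of two of the named defenses, rounding and a Laplace-noise mechanism in the spirit of differential privacy, applied to a raw embedding vector. The 768-dimensional size and the decimals/epsilon settings are illustrative assumptions, not values from the paper.

# Minimal sketch of embedding obfuscation: rounding and Laplace-noise perturbation.
# Assumed parameters (decimals, epsilon, sensitivity, dimension) are illustrative only.
import numpy as np

def round_embedding(emb: np.ndarray, decimals: int = 1) -> np.ndarray:
    """Coarsen the embedding by keeping only `decimals` decimal places."""
    return np.round(emb, decimals=decimals)

def laplace_perturb(emb: np.ndarray, epsilon: float = 1.0, sensitivity: float = 1.0) -> np.ndarray:
    """Add Laplace noise with scale sensitivity/epsilon (a DP-style mechanism)."""
    scale = sensitivity / epsilon
    noise = np.random.laplace(loc=0.0, scale=scale, size=emb.shape)
    return emb + noise

if __name__ == "__main__":
    emb = np.random.randn(768)          # e.g., a BERT-sized sentence embedding (assumed)
    print(round_embedding(emb)[:5])
    print(laplace_perturb(emb, epsilon=0.5)[:5])

Both defenses trade utility for privacy: coarser rounding or smaller epsilon makes the embedding harder to reverse-engineer but also less useful for downstream tasks, which is the trade-off the paper's evaluation examines.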
This paper develops a high-performance nonlinear control method for an electric load simulator (ELS). The tracking performance of the ELS is mainly affected by the actuator's active motion disturbance and friction nonlinearity. First, a nonlinear model of the ELS is developed, and then a Takagi-Sugeno (T-S) fuzzy model is used to represent the friction nonlinearity of the ELS. A state observer is constructed to estimate the speed of the load system. To convert the tracking control into a stabilization problem, a new control design called virtual desired state synthesis is proposed to define the internal desired states. External disturbances are attenuated based on an H∞ criterion, and the stability of the entire closed-loop model is investigated using the well-known quadratic Lyapunov function. Meanwhile, the feedback gains and the observer gains are obtained separately by solving a set of linear matrix inequalities (LMIs). Both a simulation and an experiment were performed to validate the effectiveness of the developed algorithm.

KEYWORDS: T-S fuzzy model, H∞ criterion, electric load simulator (ELS), linear matrix inequalities (LMIs), torque tracking control
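As a hedged illustration of LMI-based synthesis of the kind mentioned above, the following Python sketch uses CVXPY to check feasibility of a basic quadratic-Lyapunov LMI for an assumed linear subsystem matrix A. The paper's actual LMIs, which couple the T-S fuzzy feedback gains, the observer gains, and the H∞ attenuation level, are not reproduced here; this only shows the mechanics of posing and solving an LMI.

# Sketch: feasibility of the Lyapunov LMI  A^T P + P A < 0,  P > 0,
# for an assumed example subsystem A (not taken from the paper).
import cvxpy as cp
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # illustrative stable subsystem matrix
n = A.shape[0]

P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
constraints = [
    P >> eps * np.eye(n),                     # P positive definite
    A.T @ P + P @ A << -eps * np.eye(n),      # quadratic Lyapunov function decreases
]

prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve(solver=cp.SCS)
print("LMI status:", prob.status)
print("P =\n", P.value)

In a T-S fuzzy design the same machinery is applied to each local model (and their cross terms), with the controller and observer gains appearing as additional decision variables in the LMIs.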