Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.628

CAPE: Context-Aware Private Embeddings for Private Language Learning

Abstract: Neural language models have contributed to state-of-the-art results in a number of downstream applications, including sentiment analysis and intent classification. However, obtaining text representations or embeddings using these models risks encoding personally identifiable information learned from language and context cues, which may lead to privacy leaks. To ameliorate this issue, we propose Context-Aware Private Embeddings (CAPE), a novel approach which combines differential privacy and adversarial learning…
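
The abstract names two ingredients: a differentially private perturbation of the text representation and an adversarial objective that discourages encoding of private attributes. Below is a minimal PyTorch sketch of how these two pieces are typically combined; the names (PrivateClassifier, laplace_privatize, GradReverse), the single-linear-layer stand-in encoder, the 2*clip sensitivity bound, and every hyperparameter are assumptions for illustration, not the paper's released implementation.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; flips the gradient sign on the
    # backward pass, so the encoder learns to fool the adversary.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

def laplace_privatize(h, epsilon, clip=1.0):
    # Clip each representation to L1 norm <= clip, so any two inputs
    # differ by at most 2*clip in L1, then apply the Laplace mechanism.
    l1 = h.norm(p=1, dim=-1, keepdim=True).clamp_min(1e-12)
    h = h * torch.clamp(clip / l1, max=1.0)
    noise = torch.distributions.Laplace(
        torch.zeros_like(h), 2.0 * clip / epsilon).sample()
    return h + noise

class PrivateClassifier(nn.Module):
    def __init__(self, dim=768, n_labels=2, n_private=10,
                 epsilon=5.0, lam=1.0):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)         # stand-in for a LM encoder
        self.task_head = nn.Linear(dim, n_labels)
        self.adv_head = nn.Linear(dim, n_private)  # predicts the private attribute
        self.epsilon, self.lam = epsilon, lam

    def forward(self, x):
        h = laplace_privatize(self.encoder(x), self.epsilon)
        task_logits = self.task_head(h)
        adv_logits = self.adv_head(GradReverse.apply(h, self.lam))
        return task_logits, adv_logits

Training would minimize the task loss plus the adversary's loss; because of the reversal layer, the encoder is pushed away from representations that predict the private attribute, while the Laplace step bounds what any single input can reveal.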

Cited by 6 publications (10 citation statements)
References 32 publications
“…In order to remove the dependence on dimensionality in the Unary Encoding mechanism, they propose an Optimized Multiple Encoding, which embeds vectors with a certain fixed size. Their post-processing procedure was then improved by Plant et al. (2021). Plant et al. (2021) and Habernal (2021) also study privatizing word embeddings.…”
Section: Vanilla DP (mentioning)
confidence: 99%
“…Their post-processing procedure was then improved by Plant et al. (2021). Plant et al. (2021) and Habernal (2021) also study privatizing word embeddings. However, instead of using Unary Encoding or dropout, Krishna et al. (2021) propose ADePT, an auto-encoder-based DP algorithm.…”
Section: Vanilla DP (mentioning)
confidence: 99%
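
Both excerpts above reference the Unary Encoding mechanism, whose report length grows with the input dimensionality; that growth is the dependence the quoted work removes. As background, here is a minimal sketch of standard unary encoding under local differential privacy; the function name unary_encode and the parameters p and q are illustrative, not taken from the cited papers.

import numpy as np

def unary_encode(index, d, p=0.75, q=0.25, rng=None):
    # Randomized response on a one-hot vector of length d: a true '1'
    # bit is reported with probability p and a '0' bit is flipped to
    # '1' with probability q, giving eps = ln(p * (1 - q) / ((1 - p) * q)).
    rng = rng or np.random.default_rng()
    onehot = np.zeros(d, dtype=bool)
    onehot[index] = True
    report = rng.random(d) < np.where(onehot, p, q)
    return report.astype(np.int8)

The report is d bits long, so communicating or aggregating it scales linearly with the vocabulary or embedding dimensionality.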
“…To obscure the representation itself, we add random perturbations to each representation of a random input, as shown in Figure 2. Following Plant et al. (2021), we adopt Laplace noise as the random perturbation. Formally, the process of mixing encryption and representation encryption for x_i^{1:k} is:…”
Section: Privacy Mixer (mentioning)
confidence: 99%
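
The formal definition this excerpt introduces is truncated, so only the general shape of the perturbation step can be illustrated: independent Laplace noise added to each of the k token representations x_i^{1:k}. In this sketch the function name perturb_tokens and the free noise scale b are assumptions; the quoted paper's exact calibration is not recoverable from the excerpt.

import torch

def perturb_tokens(reps, b):
    # reps: (k, d) tensor, one row per token representation of x_i^{1:k}.
    # Adds i.i.d. Laplace(0, b) noise elementwise; smaller b preserves
    # more utility, larger b hides more of the original representation.
    noise = torch.distributions.Laplace(torch.zeros_like(reps), b).sample()
    return reps + noise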
“…One solution is to apply cryptographic techniques to the PLM (Chen et al., 2022), but these methods often come with significant communication costs and computational time, making them difficult to apply in real-world scenarios. Another potential solution is to remove the private information in word representations (Pan et al., 2020) through adversarial training (Li et al., 2018; Coavoux et al., 2018; Plant et al., 2021) and differential privacy (Lyu et al., 2020a; Hoory et al., 2021; Yue et al., 2021). However, the private information in our scenario pertains to each word in the user's plaintext.…”
Section: Introduction (mentioning)
confidence: 99%
“…In the context of document AI, such privacy techniques may be applied under different settings. For instance, an organization providing document AI services may train its models under global privacy constraints [22, 30] to safeguard its own private data, or under local privacy constraints [31, 32], where each individual client uploads only privately augmented data to the service, leaving no fingerprint that can be traced back to the client. On the other hand, federated learning [23, 24] may be deployed for private aggregation of data across multiple organizations or clients.…”
Section: Introduction (mentioning)
confidence: 99%