2020
DOI: 10.48550/arxiv.2006.00701
Preprint

Locally Differentially Private (Contextual) Bandits Learning

Abstract: We study locally differentially private (LDP) bandits learning in this paper. First, we propose simple black-box reduction frameworks that can solve a large family of context-free bandits learning problems with LDP guarantee. Based on our frameworks, we can improve previous best results for private bandits learning with one-point feedback, such as private Bandits Convex Optimization etc, and obtain the first results for Bandits Convex Optimization (BCO) with multi-point feedback under LDP. LDP guarantee and bl…

Cited by 10 publications (20 citation statements)
References 13 publications (22 reference statements)
“…Finally, it will also be interesting to consider other privacy guarantees beyond JDP and LDP, e.g., the shuffle model of DP (Cheu et al, 2019), which achieves a smooth transition between JDP and LDP. This is particularly useful even in the simpler linear bandit setting, as the regret under LDP is O(T^{3/4}) (Zheng et al, 2020) while the regret under JDP is O(√T) (Shariff and Sheffet, 2018). Thus, one interesting question is whether the shuffle model of DP can be utilized to achieve a similar regret as in JDP while providing the same strong privacy guarantee as in LDP, which is one of our ongoing works.…”
Section: Discussion
confidence: 99%
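To get a feel for the gap between the two regret bounds quoted above, a quick numeric check (plain Python, constants and log factors ignored, values purely illustrative) compares how O(T^{3/4}) and O(√T) grow with the horizon T:

```python
# Compare the growth of the LDP regret bound O(T^(3/4)) with the
# JDP bound O(sqrt(T)), ignoring constants and log factors.
for T in (10**4, 10**6, 10**8):
    ldp = T ** 0.75  # O(T^(3/4)) regret under LDP
    jdp = T ** 0.5   # O(sqrt(T)) regret under JDP
    # The multiplicative gap between the two bounds grows as T^(1/4).
    print(f"T={T:>9}: LDP ~ {ldp:,.0f}, JDP ~ {jdp:,.0f}, gap ~ {ldp / jdp:.1f}x")
```

The T^{1/4} gap is exactly what a shuffle-model analysis would hope to close while keeping local-style privacy.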
“…Recently, another variant of DP, called local differential privacy (LDP) (Duchi et al, 2013), has gained increasing popularity in personalized services due to its stronger privacy protection. It has been studied in various bandit settings recently (Ren et al, 2020; Zheng et al, 2020; Zhou and Tan, 2020; Dubey, 2021). Under LDP, each user's raw data is directly protected before being sent to the learning agent.…”
Section: Related Work
confidence: 99%
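The "raw data protected before being sent" step can be sketched as a local Laplace randomizer. This is a generic illustration, not the mechanism of any specific cited paper: the function name `privatize_reward` and the reward range [0, 1] are assumptions. Each user perturbs their own reward with Laplace noise of scale 1/ε before reporting, and the server can still estimate an arm's mean reward because the unbiased noise averages out:

```python
import math
import random

def privatize_reward(reward: float, eps: float) -> float:
    """eps-LDP report of a reward in [0, 1] via the Laplace mechanism.

    Sensitivity is 1 (rewards are bounded in [0, 1]), so the noise scale
    is 1/eps. The stdlib has no Laplace sampler, so we use inverse-CDF
    sampling: X = -b * sign(u) * ln(1 - 2|u|) for u uniform on (-1/2, 1/2).
    """
    u = random.random() - 0.5
    noise = -(1.0 / eps) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return reward + noise

# Server side: noisy reports are unbiased, so averaging recovers the mean.
random.seed(0)
true_mean, eps, n = 0.7, 1.0, 20000
reports = [privatize_reward(true_mean, eps) for _ in range(n)]
estimate = sum(reports) / n
print(f"true mean {true_mean}, private estimate {estimate:.3f}")
```

Note the LDP trade-off visible here: each individual report is heavily noised (standard deviation √2/ε per user), so the learner pays in sample efficiency for never seeing raw data.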
“…Chen et al (2020) studied combinatorial bandits with LDP guarantees. Zheng et al (2020) studied both MABs and contextual bandits, and proposed a locally differentially private algorithm for contextual linear bandits. However, differentially private RL is much less studied compared with bandits, even though MDPs are more powerful since state transition is rather common in real applications.…”
Section: Introduction
confidence: 99%
“…Ω(√T/(e^ε(e^ε − 1))) lower bound for ε-LDP contextual linear bandits. This suggests that the algorithms proposed in Zheng et al (2020) might be improvable as well.…”
Section: Introduction
confidence: 99%
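Privacy-dependent factors like e^ε − 1 in such bounds behave like ε in the high-privacy regime, which is why LDP regret bounds are often quoted with a 1/ε dependence. A quick numeric check (illustrative only) confirms the approximation e^ε − 1 ≈ ε for small ε:

```python
import math

# For small eps, exp(eps) - 1 ~ eps, so a factor like 1/(e^eps - 1)
# scales as 1/eps in the high-privacy regime.
for eps in (0.01, 0.1, 0.5):
    ratio = (math.exp(eps) - 1.0) / eps  # tends to 1 as eps -> 0
    print(f"eps={eps}: (e^eps - 1)/eps = {ratio:.4f}")
```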