2022
DOI: 10.48550/arxiv.2201.07052
Preprint

Differentially Private Reinforcement Learning with Linear Function Approximation

Abstract: Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-action MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifi…

Cited by 1 publication (2 citation statements). References 25 publications.
“…Intelligent algorithms can handle more complex problems and situations, even if little information about the model and the goal is available; thus, they can expand the horizons of privacy protection. We can further observe that inverse reinforcement learning has been adopted to address these privacy problems [71,72]. In fact, however, inverse reinforcement learning is more frequently used to approach the problem of inferring an expert's reward function from demonstrations and to provide the reward to the learning system rather than to tackle security and/or privacy issues.…”
Section: Summary and Discussion
“…Zhou et al [71] proposed to protect users' sensitive and private information by considering regret minimization in large state and action spaces. Their work used the notion of joint differential privacy (JDP) and considered MDPs by means of linear function approximation.…”
Section: Researches in Privacy of Environment
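The mechanism behind joint differential privacy with linear function approximation can be illustrated with a minimal sketch (this is not the authors' exact algorithm; the function name, bounds, and noise calibration below are illustrative assumptions): in least-squares value iteration, the learner's per-episode statistics are a Gram matrix and a feature-target vector, and a standard way to privatize them is to perturb both with calibrated Gaussian noise before solving the regularized regression.

```python
import numpy as np

def private_lsq_estimate(features, targets, epsilon, delta, lam=1.0, rng=None):
    """Illustrative (epsilon, delta)-DP ridge-regression sketch: perturb the
    Gram matrix and the feature-target vector with Gaussian noise, then add
    extra regularization so the noisy Gram matrix stays invertible.
    Assumes each feature vector and each target is bounded in norm by 1."""
    rng = np.random.default_rng(rng)
    n, d = features.shape
    # Gaussian-mechanism noise scale for sensitivity-1 statistics.
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Symmetrized noise added to the regularized Gram matrix.
    noise = sigma * rng.standard_normal((d, d))
    gram = features.T @ features + lam * np.eye(d) + (noise + noise.T) / 2.0
    # Extra identity shift keeps the perturbed matrix positive definite w.h.p.
    gram += 2.0 * sigma * np.sqrt(d) * np.eye(d)
    # Independent noise on the feature-target vector.
    b = features.T @ targets + sigma * rng.standard_normal(d)
    return np.linalg.solve(gram, b)
```

Because only the aggregated statistics are perturbed (rather than each user's data individually), this style of release is what enables the joint-DP guarantee while keeping the regret overhead small.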