2020
DOI: 10.36227/techrxiv.12645497
Preprint

Federated Reinforcement Distillation with Proxy Experience Memory

Abstract: This paper was presented at the 28th International Joint Conference on Artificial Intelligence (IJCAI-19), 1st Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FML'19), Macau, August 2019.

Cited by 13 publications (21 citation statements)
References 1 publication
“…Similarly, a policy-based method in MARL can be improved in combination with FD. For actor-critic RL, FD is applicable to either the actor (policy) NN or the critic (value) NN, or to both NNs [189]. However, with large input state and/or output dimensions, these approaches may incur huge communication and memory costs.…”
Section: ) Each Head Device Updates Its Primal Variables As
confidence: 99%
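The quoted passage notes that federated distillation (FD) exchanges model outputs rather than model weights, which keeps communication cost independent of network size. A minimal sketch of that idea, assuming a toy linear policy and synthetic shared proxy states (all names, dimensions, and the linear model are hypothetical, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 agents, 4-dim states, 2 discrete actions.
N_AGENTS, STATE_DIM, N_ACTIONS = 3, 4, 2
proxy_states = rng.normal(size=(16, STATE_DIM))  # shared proxy inputs

def local_policy_logits(states):
    # Stand-in for an agent's actor network: a random linear policy.
    w = rng.normal(size=(STATE_DIM, N_ACTIONS))
    return states @ w

# Each agent uploads only its policy outputs on the proxy states,
# never its network weights.
uploads = [local_policy_logits(proxy_states) for _ in range(N_AGENTS)]

# The server averages the outputs into a global distillation target,
# which agents would then use as a soft label for local training.
global_target = np.mean(uploads, axis=0)
print(global_target.shape)  # (16, 2)
```

The payload per round is the `(16, 2)` output array, regardless of how large each local network is, which is the communication saving FD offers over weight averaging.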
“…Applying Mix2up to other distributed learning scenarios could be an interesting topic for future research. Also, extending this idea to distributed reinforcement learning by leveraging the proxy experience memory method as in [6] as well as the convergence analysis of Mix2FLD is left to future work.…”
Section: Numerical Evaluation and Discussion
confidence: 99%
“…Further, with a certain degree of privacy protection, [23] establishes a fuzzy approximation of the local experience cache for each client before sharing with the server. So it naturally is not feasible for online RL algorithms.…”
Section: B Federated Reinforcement Learning
confidence: 99%
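The "fuzzy approximation of the local experience cache" referenced above corresponds to the proxy experience memory idea: local states are coarsened into proxy states, and only averaged experiences per proxy state are shared with the server. A minimal sketch under assumed toy data (grid size, dimensions, and variable names are illustrative, not the paper's actual scheme):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical local experience: (state, policy-output) pairs.
states = rng.uniform(0, 1, size=(100, 2))
policy_outputs = rng.uniform(0, 1, size=(100, 3))

# Quantize each state onto a coarse grid; each cell is a "proxy state".
GRID = 4
cells = np.floor(states * GRID).astype(int)   # per-dimension cell index
cell_ids = cells[:, 0] * GRID + cells[:, 1]   # flatten to one cell id

# Proxy experience memory: per cell, the mean state and mean policy
# output. Only these coarse averages leave the client, not raw samples.
proxy_memory = {}
for cid in np.unique(cell_ids):
    mask = cell_ids == cid
    proxy_memory[int(cid)] = (states[mask].mean(axis=0),
                              policy_outputs[mask].mean(axis=0))
```

Because each shared entry is an average over a cell, individual transitions cannot be recovered, which is the privacy trade-off the quoted critique is reacting to.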
“…However, some of the existing federated reinforcement learning methods are algorithm-independent but need to directly share policy parameters [17]-[22], which risks revealing private information, while others are limited to specific scenarios [23], [24].…”
Section: Introduction
confidence: 99%