Federated Reinforcement Distillation with Proxy Experience Memory

Cha, Han; Park, Jihong; Kim, Hyesung; Kim, Seong-Lyun; Bennis, Mehdi

doi:10.36227/techrxiv.12645497

Cited by 13 publications

(21 citation statements)

References 1 publication

“…Similarly, a policy-based method in MARL can be improved in combination with FD. For actor-critic RL, FD is applicable to either one of the actor (policy) NN or the critic (value) NN or to both NNs [189]. However, with large input state and/or output dimensions, these approaches may incur huge communication and memory costs.…”

Section: Each Head Device Updates Its Primal Variables As (mentioning)

confidence: 99%

Wireless Network Intelligence at the Edge

et al. 2019

Self Cite

…edge devices. The new breed of intelligent devices and high-stake applications (drones, augmented/virtual reality, autonomous systems, and so on) requires a novel paradigm change calling for distributed, low-latency and reliable ML at the wireless network edge (referred to as edge ML). In edge ML, training data are unevenly distributed over a large number of edge nodes, which have access to a tiny fraction of the data. Moreover, training and inference are carried out collectively over wireless links, where edge devices communicate and exchange their learned models (not their private data). In a first of its kind, this article explores the key building blocks of edge ML, different neural network architectural splits and their inherent tradeoffs, as well as theoretical and technical enablers stemming from a wide range of mathematical disciplines. Finally, several case studies pertaining to various high-stake applications are presented to demonstrate the effectiveness of edge ML in unlocking the full potential of 5G and beyond.

“…Applying Mix2up to other distributed learning scenarios could be an interesting topic for future research. Also, extending this idea to distributed reinforcement learning by leveraging the proxy experience memory method as in [6] as well as the convergence analysis of Mix2FLD is left to future work.…”

Section: Numerical Evaluation and Discussion (mentioning)

confidence: 99%

Mix2FLD: Downlink Federated Learning After Uplink Federated Distillation With Two-Way Mixup

Oh, Park, Jeong et al. 2020

IEEE Commun. Lett.

Self Cite

This letter proposes a novel communication-efficient and privacy-preserving distributed machine learning framework, coined Mix2FLD. To address uplink-downlink capacity asymmetry, local model outputs are uploaded to a server in the uplink as in federated distillation (FD), whereas global model parameters are downloaded in the downlink as in federated learning (FL). This requires a model output-to-parameter conversion at the server, after collecting additional data samples from devices. To preserve privacy while not compromising accuracy, linearly mixed-up local samples are uploaded, and inversely mixed up across different devices at the server. Numerical evaluations show that Mix2FLD achieves up to 16.7% higher test accuracy while reducing convergence time by up to 18.8% under asymmetric uplink-downlink channels compared to FL. Index Terms: distributed machine learning, on-device learning, federated learning, federated distillation, uplink-downlink asymmetry.

“…Further, with a certain degree of privacy protection, [23] establishes a fuzzy approximation of the local experience cache for each client before sharing with the server. So it naturally is not feasible for online RL algorithms.…”

Section: B Federated Reinforcement Learning (mentioning)

confidence: 99%

“…However, some of the existing federated reinforcement learning methods are algorithm-independent but need directly sharing parameters of policy [17]-[22], which will cause revealing privacy while others are limited to some specific scenarios [23], [24].…”

Section: Introduction (mentioning)

confidence: 99%

Reward Shaping Based Federated Reinforcement Learning

Hua, Liu et al. 2021

IEEE Access

Federated reinforcement learning aims to promote training efficiency or improve policy quality through information interaction with privacy protection. Existing federated reinforcement learning methods rarely utilize the structure of reinforcement learning algorithms while are limiting to specific scenarios or algorithms. We propose a general federated reinforcement learning framework FRS, which employs reward shaping as the federated information shared among different clients with different tasks to promote each client's training speed and policy quality. The federated reward shaping is implicitly learned by average state value information of all clients to protect each client's task privacy as the real trajectory is anonymous. Experiments on the GridWorld environment show that FRS can algorithm-independently improve the policy quality and promote training speed with protecting each client's privacy.

Federated Reinforcement Distillation with Proxy Experience Memory

Abstract: This paper was presented at the 28th International Joint Conference on Artificial Intelligence (IJCAI-19), 1st Workshop on Federated Machine Learning for User Privacy and Data Confidentiality (FML'19), Macau, August 2019.
