This letter proposes a novel communication-efficient and privacy-preserving distributed machine learning framework, coined Mix2FLD. To address uplink-downlink capacity asymmetry, local model outputs are uploaded to a server in the uplink as in federated distillation (FD), whereas global model parameters are downloaded in the downlink as in federated learning (FL). This requires a model output-to-parameter conversion at the server, after collecting additional data samples from devices. To preserve privacy while not compromising accuracy, linearly mixed-up local samples are uploaded, and inversely mixed up across different devices at the server. Numerical evaluations show that Mix2FLD achieves up to 16.7% higher test accuracy while reducing convergence time by up to 18.8% under asymmetric uplink-downlink channels compared to FL.
Index Terms: distributed machine learning, on-device learning, federated learning, federated distillation, uplink-downlink asymmetry.
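The two-way mixup described above can be pictured with a short NumPy sketch. This is a minimal illustration under simplifying assumptions, not the exact Mix2FLD rule: the single shared mixing ratio lam, the inverse_mixup coefficients, and the random 784-dimensional samples are all hypothetical, chosen only to show mixed samples being uploaded and then recombined across devices at the server.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x_a, x_b, lam):
    # Device-side step: upload a linear mixture of two local samples instead of raw data.
    return lam * x_a + (1.0 - lam) * x_b

# Hypothetical flattened 28x28 image samples held on two different devices.
mixed_dev1 = mixup(rng.random(784), rng.random(784), lam=0.6)
mixed_dev2 = mixup(rng.random(784), rng.random(784), lam=0.6)

def inverse_mixup(m_i, m_j, lam):
    # Server-side step with illustrative coefficients: combine mixed uploads
    # from two *different* devices so the result resembles a usable training
    # sample while no single device's raw sample is ever isolated.
    return (m_i - (1.0 - lam) * m_j) / lam

seed_sample = inverse_mixup(mixed_dev1, mixed_dev2, lam=0.6)
# The server uses such seed samples for the output-to-parameter conversion:
# distilling the aggregated model outputs (uplink FD) into global model
# parameters that are broadcast in the downlink (FL).
```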
To cope with the lack of on-device machine learning samples, this article presents a distributed data augmentation algorithm, coined federated data augmentation (FAug). In FAug, devices share a tiny fraction of their local data, i.e., seed samples, and collectively train a synthetic sample generator that can augment the local datasets of the devices. To further improve FAug, we introduce a multi-hop seed sample collection method and an oversampling technique that mixes up the collected seed samples. Both approaches benefit from the crowd of devices, by concealing each seed sample's origin from preceding hops and by feeding diverse seed samples into the generator training. In image classification tasks, simulations demonstrate that the proposed FAug frameworks yield stronger privacy guarantees, lower communication latency, and higher on-device ML accuracy.
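The seed-sample collection and mixup-based oversampling steps can be sketched as follows. This is a rough sketch, assuming a small per-device seed fraction, random flattened-image data, and a Beta-distributed mixing ratio; the function names are hypothetical, and the generator training itself (e.g., a conditional GAN) is only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def collect_seed_samples(local_datasets, frac=0.05):
    """Each device contributes a tiny fraction of its local data as seed samples.
    (In FAug the upload can be relayed over multiple hops to hide its origin.)"""
    seeds = []
    for data in local_datasets:
        k = max(1, int(frac * len(data)))
        seeds.append(data[rng.choice(len(data), size=k, replace=False)])
    return np.concatenate(seeds)

def oversample_by_mixup(seeds, n_new, alpha=0.2):
    """Oversampling step: mix up random pairs of collected seed samples to obtain
    a larger, more diverse training pool for the synthetic sample generator."""
    i = rng.integers(0, len(seeds), size=n_new)
    j = rng.integers(0, len(seeds), size=n_new)
    lam = rng.beta(alpha, alpha, size=(n_new, 1))
    return lam * seeds[i] + (1.0 - lam) * seeds[j]

# Toy example: three devices with small local datasets of flattened 28x28 images.
local_datasets = [rng.random((200, 784)) for _ in range(3)]
seeds = collect_seed_samples(local_datasets)
augmented_pool = oversample_by_mixup(seeds, n_new=500)
# A generator would then be trained on `augmented_pool` and broadcast back to
# the devices, which use it to augment their local datasets.
```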
Personalization in federated learning (FL) functions as a coordinator for clients with high variance in data or behavior. Ensuring the convergence of these clients' models depends on how closely users collaborate with others having similar patterns or preferences. However, similarity is generally difficult to quantify in a decentralized network, where each user has only limited knowledge of other users' models. To cope with this issue, we propose a personalized, fully decentralized FL algorithm that leverages knowledge distillation techniques to enable each device to discern statistical distances between local models. Each client device can enhance its performance without sharing local data, by estimating the similarity between two models' intermediate outputs obtained by feeding the same local samples, as in knowledge distillation. Our empirical studies demonstrate that the proposed algorithm improves clients' test accuracy in fewer iterations under highly non-independent and identically distributed (non-i.i.d.) data distributions, and benefits agents with small datasets, all without the need for a central server.
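A minimal NumPy sketch of the distillation-style similarity estimate follows: a client feeds its own samples through its model and a received neighbour model, measures the divergence between the two output distributions, and weights aggregation accordingly. The single-layer "model", the KL-based distance, and the softmax weighting over neighbours are illustrative assumptions rather than the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def model_outputs(weights, x):
    """Stand-in for a local model: a single linear layer producing class probabilities."""
    return softmax(x @ weights)

def statistical_distance(p, q, eps=1e-12):
    """Average KL divergence between two models' outputs on the same local batch,
    used as the similarity proxy (smaller distance = more similar clients)."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))

# Toy setup: a client compares its model against two neighbours using its own data.
x_local = rng.random((32, 20))                 # the client's private batch (never shared)
w_self = rng.normal(size=(20, 10))
w_neighbours = [w_self + 0.1 * rng.normal(size=(20, 10)),  # similar neighbour
                rng.normal(size=(20, 10))]                 # dissimilar neighbour

p_self = model_outputs(w_self, x_local)
distances = [statistical_distance(p_self, model_outputs(w, x_local)) for w in w_neighbours]

# Similarity-weighted aggregation: closer neighbours get larger weights.
# (A full scheme would also retain a share of the client's own weights.)
sims = softmax(-np.array(distances))
w_new = sum(s * w for s, w in zip(sims, w_neighbours))
```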