Efficient deployment of deep neural networks across devices with diverse resource constraints, especially on edge devices, is one of the most challenging problems when data privacy must be preserved. Conventional approaches have evolved to either improve a single global model while keeping each client's training data decentralized (i.e., data heterogeneity) or to train a once-for-all network that supports diverse architectural settings for heterogeneous systems with different computational capabilities (i.e., model heterogeneity). However, little research has considered both directions simultaneously. In this work, we propose a novel framework that addresses both scenarios, namely Federation of Supernet Training (FedSup), in which clients send and receive a supernet containing all possible architectures that can be sampled from it. The framework is inspired by the observation that averaging parameters during the model aggregation stage of Federated Learning (FL) is similar to weight-sharing in supernet training. Specifically, FedSup combines the weight-sharing approach widely used for training single-shot models with FL's parameter averaging (FedAvg). Under our framework, we further present an efficient algorithm (E-FedSup) that sends a submodel to each client in the broadcast stage to reduce communication costs and training overhead. We demonstrate several strategies to enhance supernet training in the FL environment and conduct extensive empirical evaluations. The resulting framework is shown to pave the way for robustness to both data- and model-heterogeneity on several standard benchmarks.

Recent FL works have evolved toward designing new objective functions for model aggregation [1; 25; 34; 60; 13; 71; 31], using auxiliary data on the central server [39; 73], encoding weights for an efficient communication stage [63; 23; 65], or recruiting helpful clients to obtain a more accurate global model [35; 8; 48].
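The connection between FedAvg-style aggregation and supernet weight-sharing can be sketched in a toy example. This is an illustrative simplification, not the authors' implementation: weights are flat lists, sub-networks are channel prefixes, and the function names (`sample_subnet`, `fedavg_aggregate`) are hypothetical.

```python
# Toy sketch: clients train randomly sized sub-networks of a shared supernet
# (as in E-FedSup's submodel broadcast); the server averages each weight
# position over the clients that actually trained it (FedAvg-style).

def sample_subnet(supernet, keep_ratio):
    """Keep a prefix of each layer's weights (a toy weight-sharing scheme)."""
    return {name: w[: max(1, int(len(w) * keep_ratio))]
            for name, w in supernet.items()}

def fedavg_aggregate(supernet, client_updates):
    """Average every weight position over the clients that trained it;
    positions no client touched keep their previous supernet value."""
    new = {}
    for name, w in supernet.items():
        acc = [0.0] * len(w)   # running sum per position
        cnt = [0] * len(w)     # number of clients covering each position
        for update in client_updates:
            for i, v in enumerate(update.get(name, [])):
                acc[i] += v
                cnt[i] += 1
        new[name] = [acc[i] / cnt[i] if cnt[i] else w[i]
                     for i in range(len(w))]
    return new

# Two clients return sub-networks of different widths; averaging behaves
# like weight-sharing: shared prefixes are averaged, the rest is kept.
supernet = {"fc": [1.0, 1.0, 1.0, 1.0]}
updates = [{"fc": [2.0, 2.0]}, {"fc": [4.0, 4.0, 4.0]}]
print(fedavg_aggregate(supernet, updates))  # {'fc': [3.0, 3.0, 4.0, 1.0]}
```

In this simplified view, broadcasting only `sample_subnet(supernet, keep_ratio)` instead of the full supernet is what reduces per-round communication in E-FedSup.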
On the other side, there has been tremendous recent interest in deploying FL algorithms in real-world applications such as mobile devices and the Internet of Things.

Preprint. Under review.