We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other system has been described in literature that supports FL at scale. We include comparisons to that system to help discuss design decisions and attached trade-offs. Finally, we describe two specific large scale personalization use cases in detail to showcase the applicability of federated tuning to on-device personalization and to highlight application specific solutions.
Federated learning with differential privacy, or private federated learning, provides a strategy to train machine learning models while respecting users' privacy. However, differential privacy can disproportionately degrade the performance of the models on under-represented groups, as these parts of the distribution are difficult to learn in the presence of noise. Existing approaches for enforcing fairness in machine learning models have considered the centralized setting, in which the algorithm has access to the users' data. This paper introduces an algorithm to enforce group fairness in private federated learning, where users' data does not leave their devices. First, the paper extends the modified method of differential multipliers to empirical risk minimization with fairness constraints, thus providing an algorithm to enforce fairness in the central setting. Then, this algorithm is extended to the private federated learning setting. The proposed algorithm, FPFL, is tested on a federated version of the Adult dataset and an "unfair" version of the FEMNIST dataset. The experiments on these datasets show how private federated learning accentuates unfairness in the trained models, and how FPFL is able to mitigate such unfairness.
Speech representations learned with self-supervised learning (SSL) have the potential to significantly improve the performance of a number of audio applications, especially when availability of labeled data from the deployment domain is limited. Despite their successes, SSL training methods are compute-and memory-heavy, and require large investments in computing infrastructure, thus putting it out of the reach of most institutions. Therefore, building efficient model architectures is essential for the wide-scale adoption of SSL in speech technologies. CNN-based Acoustic Feature Extractors (AFE), which are widely used as encoders of acoustic waveforms, remain one of the main efficiency bottlenecks. This work proposes replacing CNN-based AFEs with more efficient ones and demonstrates that SSL pre-training time and memory consumption can be reduced by a factor of two to three over existing methods while preserving performances in speech-, command-, and speakerrecognition tasks.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.