“…With the commercial success of deep learning in fields such as computer vision [1], natural language processing [2], speech recognition [3], and language translation [4], an increasing number of models are trained on central servers and then deployed to remote devices, often personalized to a specific user's needs. Personalization requires models that can be updated inexpensively, by minimizing the number of parameters that must be stored and/or transmitted, and it frequently calls for few-shot learning methods because the amount of training data from an individual user may be small [5]. At the same time, for privacy, security, and performance reasons, it can be advantageous to use federated learning, in which a model is trained across an array of remote devices, each holding different data, and gradient or parameter updates, rather than the training data itself, are shared with a central server [6].…”
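The federated setup described above, where only parameter updates leave the device, can be illustrated with a minimal federated-averaging sketch in the style of FedAvg. This is a toy illustration, not the paper's method: the two clients, their quadratic losses, and all hyperparameters (`lr`, `steps`, round count) are hypothetical, chosen so the behavior is easy to verify.

```python
import numpy as np

def local_update(weights, grad_fn, lr=0.1, steps=5):
    """One client's local training: a few gradient steps on its private data.
    Only the resulting weights (not the data) are returned to the server."""
    w = weights.copy()
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w

def federated_average(global_w, client_grad_fns):
    """One server round: broadcast the global weights, collect each client's
    locally updated weights, and average them (FedAvg with equal weighting)."""
    client_ws = [local_update(global_w, g) for g in client_grad_fns]
    return np.mean(client_ws, axis=0)

# Hypothetical example: two clients, each fitting a scalar w to a different
# private target t via the loss (w - t)^2, so grad = 2 * (w - t).
targets = [3.0, 5.0]
grad_fns = [lambda w, t=t: 2 * (w - t) for t in targets]

w = np.zeros(1)
for _ in range(50):
    w = federated_average(w, grad_fns)
print(w)  # converges near the mean of the client targets, 4.0
```

The server never sees either client's target (a stand-in for private data); it only averages the weight vectors, which is the property the privacy argument in the text relies on.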