“…Moreover, large-cohort training can introduce fundamental optimization and generalization issues. Our results are reminiscent of work on large-batch training in centralized settings, where larger batches can cause convergence improvements to stagnate (Dean et al., 2012; You et al., 2017; Golmant et al., 2018; McCandlish et al., 2018; Yin et al., 2018) and even lead to generalization issues with deep neural networks (Shallue et al., 2019; Ma et al., 2018; Keskar et al., 2017; Hoffer et al., 2017; Masters and Luschi, 2018; Lin et al., 2019, 2020). While some of the challenges we identify with large-cohort training parallel issues that arise in large-batch centralized learning, others are unique to federated learning and have not been previously identified in the literature.…”