With an increasing number of smart devices, such as Internet of Things (IoT) devices, deployed in the field, offloading the training of neural networks (NNs) to a central server becomes increasingly infeasible. Recent efforts to improve users’ privacy have led to on-device learning emerging as an alternative. However, a model trained only on a single device, using only local data, is unlikely to reach a high accuracy. Federated learning (FL) has been introduced as a solution, offering a privacy-preserving trade-off between communication overhead and model accuracy: knowledge is shared between devices without disclosing the devices’ private data. The applicability and benefit of baseline FL are, however, limited in many relevant use cases due to the heterogeneity present in such environments. In this survey, we outline the heterogeneity challenges FL has to overcome to be widely applicable in real-world applications. We focus in particular on computational heterogeneity among the participating devices and provide a comprehensive overview of recent works on heterogeneity-aware FL. We discuss two groups: works that adapt the NN architecture and works that approach heterogeneity at the system level, covering Federated Averaging (FedAvg)-, distillation-, and split learning-based approaches, as well as synchronous and asynchronous aggregation schemes.
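To make the baseline concrete, the sketch below shows the aggregation step of FedAvg (McMahan et al., 2017) that many of the surveyed heterogeneity-aware approaches extend: the server averages the clients' model weights, weighting each client by its local sample count. The function name fedavg_aggregate and the dictionary-of-arrays weight representation are illustrative assumptions for this sketch, not the interface of any particular FL framework.

```python
# Minimal sketch of the FedAvg aggregation step. Each client k is assumed
# to return its updated weights w_k and its local sample count n_k; all
# names here are illustrative, not taken from the survey.
from typing import Dict, List, Tuple
import numpy as np

def fedavg_aggregate(
    client_updates: List[Tuple[Dict[str, np.ndarray], int]]
) -> Dict[str, np.ndarray]:
    """Average client weights, weighted by each client's local data size."""
    total_samples = sum(n_k for _, n_k in client_updates)
    global_weights: Dict[str, np.ndarray] = {}
    for layer in client_updates[0][0]:
        # Weighted sum over clients: sum_k (n_k / n_total) * w_k[layer]
        global_weights[layer] = sum(
            (n_k / total_samples) * w_k[layer] for w_k, n_k in client_updates
        )
    return global_weights
```

In this synchronous baseline, every client is expected to deliver an update on the same NN architecture within the same round; it is exactly this assumption that the architecture-adapting and system-level approaches discussed in the survey relax.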