Software defect prediction identifies modules in software projects that are likely to contain defects. Heterogeneous defect prediction (HDP) builds a cross-project defect prediction model from different software defect datasets. However, because multi-source data are heterogeneous, model performance is usually unsatisfactory. In addition, project data holders are often unwilling to disclose their data because of privacy regulations and other concerns, which results in data islands. This paper presents federated prototype learning based on prototype averaging (FPLPA), which combines federated learning (FL) with prototype learning for heterogeneous defect prediction. First, each client uses the one-sided selection (OSS) algorithm to remove noise from its local training data and applies the chi-square test to select the optimal feature subset. Second, each client constructs a convolutional prototype network (CPN) to generate its own local prototypes. CPNs are more robust to heterogeneous data than convolutional neural networks (CNNs) and avoid the bias caused by class imbalance in software data. The prototypes serve as the information exchanged between the clients and the server. Because local prototypes are generated in an irreversible way, exchanging them instead of raw data helps protect privacy during communication. Finally, each client updates its local CPN using the loss between its local prototypes and the global prototypes as a regularization term. We evaluate FPLPA on 10 projects from three public datasets (AEEEM, NASA, and ReLink), and the experimental results show that FPLPA outperforms other HDP solutions.

INDEX TERMS Heterogeneous defect prediction, federated learning, prototype learning, data islands.
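
The prototype exchange summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that every client's CPN has already mapped its samples to embeddings of a common dimension, that a local prototype is the per-class mean embedding, that the server takes an unweighted mean over clients, and that the regularizer is a squared Euclidean distance. The function names `local_prototypes`, `average_prototypes`, and `prototype_regularizer` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = List[float]


def local_prototypes(embeddings: Sequence[Vector], labels: Sequence[int]) -> Dict[int, Vector]:
    """Client side: mean embedding per class (assumed definition of a local prototype)."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def average_prototypes(client_protos: Sequence[Dict[int, Vector]]) -> Dict[int, Vector]:
    """Server side: unweighted mean of the clients' prototypes for each class (assumption)."""
    grouped: Dict[int, List[Vector]] = defaultdict(list)
    for protos in client_protos:
        for c, p in protos.items():
            grouped[c].append(p)
    return {c: [sum(col) / len(ps) for col in zip(*ps)] for c, ps in grouped.items()}


def prototype_regularizer(local: Dict[int, Vector], global_: Dict[int, Vector]) -> float:
    """Squared distance between local and global prototypes, summed over shared classes."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(local[c], global_[c]))
        for c in local
        if c in global_
    )


# Two toy clients; label 0 = clean module, label 1 = defective module.
protos_a = local_prototypes([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]], [0, 0, 1])
protos_b = local_prototypes([[3.0, 1.0], [0.0, 4.0]], [0, 1])
global_protos = average_prototypes([protos_a, protos_b])
penalty_a = prototype_regularizer(protos_a, global_protos)
```

In this sketch only class-level means leave a client, never individual samples. In training, a penalty such as `penalty_a` would be added to the client's classification loss with a weighting factor, which the abstract does not specify.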