Accurate and efficient post-earthquake building damage assessment allows key damage information to be obtained quickly after an earthquake, providing strong support for rescue and reconstruction efforts. Although many methods have been proposed, most extract severely damaged and collapsed buildings with limited accuracy and therefore cannot meet the needs of emergency response and rescue operations. In this paper, we develop a novel building damage heterogeneity enhancement network (BDHE-Net) for pixel-level building damage classification of post-earthquake unmanned aerial vehicle (UAV) and remote sensing data. The proposed BDHE-Net comprises three modules: a data augmentation module (DAM), which alleviates the weight bias toward the intact and slightly damaged categories during model training; a building damage attention module (BDAM), which attends to the heterogeneous characteristics of damaged buildings; and a multilevel feature adaptive fusion module (MFAF), which enhances the extraction of complete building contours at different image resolutions. In addition, a combined loss function is used to focus more attention on the under-represented severely damaged and collapsed classes. The proposed model was tested on remote sensing and UAV images acquired after the Afghanistan and Baoxing earthquakes, and the contributions of the combined loss function and the three modules were studied. The results show that, compared with state-of-the-art methods, the proposed BDHE-Net achieves the best results, with an F1 score improvement of 6.19–8.22%. Integrating the DAM, BDAM, and MFAF modules with the combined loss function improves the model’s classification accuracy for the severely damaged and collapsed categories.
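
The abstract does not state which terms the combined loss contains. As an illustration only, the following minimal sketch assumes one common choice for imbalanced segmentation: a class-weighted cross-entropy term plus a soft Dice term. The function name `combined_loss`, the class weights, and the mixing factor `dice_weight` are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combined_loss(logits, labels, class_weights, dice_weight=0.5, eps=1e-6):
    """Illustrative combined loss (an assumption, not the paper's definition).

    logits: list of per-pixel logit lists, one logit per damage class
    labels: list of integer class ids, one per pixel
    class_weights: per-class weights; larger values emphasize rare classes
    """
    probs = [softmax(p) for p in logits]

    # Weighted cross-entropy, normalized by the total weight of the pixels.
    ce_num = sum(-class_weights[y] * math.log(p[y] + eps)
                 for p, y in zip(probs, labels))
    ce_den = sum(class_weights[y] for y in labels)
    ce = ce_num / ce_den

    # Soft Dice loss, averaged over the classes that occur in the labels.
    dice_terms = []
    for c in sorted(set(labels)):
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denom = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        dice_terms.append(1.0 - (2.0 * inter + eps) / (denom + eps))
    dice = sum(dice_terms) / len(dice_terms)

    return (1.0 - dice_weight) * ce + dice_weight * dice

# Hypothetical four-class setup: 0 = intact, 1 = slight, 2 = severe, 3 = collapsed.
labels = [0, 0, 0, 3]
logits = [[10, 0, 0, 0]] * 4            # the collapsed pixel is predicted as intact
uniform = combined_loss(logits, labels, [1, 1, 1, 1])
weighted = combined_loss(logits, labels, [1, 1, 1, 5])
```

In this toy case the misclassified collapsed pixel contributes more to the loss when its class weight is raised, which is the kind of emphasis on minority classes that the abstract describes.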