The persistent and widespread outbreak of pinewood nematode disease has seriously threatened the sustainable development of forestry in China. Many studies have combined high-resolution remote sensing images with deep semantic segmentation algorithms to identify standing dead trees (SDTs) in the red-attack stage. However, because of complex backgrounds, densely distributed detection scenes, and unbalanced training samples, conventional segmentation models struggle to detect SDTs across a variety of complex scenes. To address these problems and improve recognition accuracy, we propose a new detection method, the multi-scale spatial supervision convolutional network (MSSCN), which identifies SDTs in a wide range of complex scenes from airborne remote sensing imagery. In this method, a Gaussian kernel is used to generate a confidence map from SDTs annotated as points in the training samples, and a multi-scale spatial attention block is added to a fully convolutional neural network to reduce the loss of spatial information. In addition, an augmentation strategy called copy–pasting is used to compensate for the lack of sufficient samples in the study area. Validation in four forest areas covering two forest types and two disease outbreak intensities showed the following. (1) The copy–pasting method augments the training samples and improves detection accuracy when a suitable oversampling rate is used; the best oversampling rate must be determined carefully from the input training samples and image data. (2) Based on the two-dimensional spatial Gaussian kernel distribution function and the multi-scale spatial attention structure, the MSSCN model effectively delineates the extent of dead trees in a confidence map, and a subsequent search for local maxima locates the individual dead trees.
The precision, recall, and F1-score averaged across the forest types and outbreak-intensity areas reach 0.94, 0.84, and 0.89, respectively, which is the best overall performance in the comparison with FCN8s and U-Net. (3) In terms of forest type and outbreak intensity, the MSSCN performs best in pure pine forest and in low-outbreak-intensity areas. Compared with FCN8s and U-Net, the MSSCN achieves the highest recall in all forest types and outbreak-intensity areas. Meanwhile, its precision remains high, which means that the proposed method strikes a good balance between precision and recall.
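
The point-to-confidence-map step and the subsequent maximum location search can be illustrated with a minimal NumPy sketch. It is not the MSSCN implementation: the function names, the kernel width `sigma`, the detection `threshold`, the search `radius`, and the choice to merge overlapping kernels with an element-wise maximum are all assumptions made for illustration, and in practice the confidence map being searched would be the network's prediction rather than the rendered ground truth.

```python
import numpy as np


def gaussian_confidence_map(points, shape, sigma=2.0):
    """Render (row, col) point annotations as a confidence map in [0, 1].

    Each point contributes a 2-D Gaussian with peak value 1. Overlapping
    kernels are merged with an element-wise maximum (an assumption here),
    so neighbouring trees keep separate peaks.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    conf = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        dist2 = (rows - r) ** 2 + (cols - c) ** 2
        conf = np.maximum(conf, np.exp(-dist2 / (2.0 * sigma ** 2)))
    return conf


def find_peaks(conf, threshold=0.5, radius=2):
    """Return (row, col) of pixels that reach the threshold and equal the
    maximum of their (2 * radius + 1)-pixel square neighbourhood."""
    h, w = conf.shape
    k = 2 * radius + 1
    # Pad with -inf so border pixels are compared only with real neighbours.
    padded = np.pad(conf, radius, mode="constant", constant_values=-np.inf)
    # Stack every shifted view of the map and reduce to a neighbourhood max.
    shifted = [padded[dr:dr + h, dc:dc + w] for dr in range(k) for dc in range(k)]
    local_max = np.max(np.stack(shifted), axis=0)
    mask = (conf >= threshold) & (conf == local_max)
    return [(int(r), int(c)) for r, c in np.argwhere(mask)]


# Hypothetical point labels for three dead trees in a 64 x 64 tile.
labels = [(10, 12), (14, 18), (30, 40)]
conf_map = gaussian_confidence_map(labels, shape=(64, 64))
detections = find_peaks(conf_map)
```

With a real prediction, `threshold` and `radius` trade precision against recall: a higher threshold or a wider window suppresses spurious peaks but may merge or drop closely spaced trees.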