Accurately extracting landslide areas from high-spatial-resolution remote sensing images with deep learning methods is an active research topic. However, existing deep learning algorithms are affected by background noise and by landslide scale effects during extraction, which degrades feature extraction. To address this issue, this paper proposes an improved mask region-based convolutional neural network (Mask R-CNN) model to identify the landslide distribution in unmanned aerial vehicle (UAV) images. The model is improved in three respects: (1) a convolutional block attention module (CBAM) is added as an attention mechanism to the backbone residual neural network (ResNet); (2) a bottom-up pathway is added to the feature pyramid network (FPN) module; and (3) the region proposal network (RPN) is replaced by a guided anchoring RPN (GA-RPN). Sanming City, China, was selected as the study area for the experiments. The experimental results show that the improved model achieves a recall of 91.4% and an accuracy of 92.6%, which are 12.9% and 10.9% higher, respectively, than those of the original Mask R-CNN model, indicating that the improved model is more effective for landslide extraction.
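As an illustration of how the two reported metrics can be computed, the following minimal sketch evaluates recall and accuracy from a predicted and a reference binary landslide mask. It assumes the standard pixel-wise definitions, recall = TP / (TP + FN) and accuracy = (TP + TN) / (TP + FP + FN + TN); the abstract does not state the exact definitions or the evaluation unit used in the experiments. The function names and the toy masks are hypothetical and are not taken from the study data.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 1 = landslide pixel, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN) pixel by pixel for two equally sized masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # landslide predicted and present
            elif p and not t:
                fp += 1      # landslide predicted but absent
            elif t:
                fn += 1      # landslide present but missed
            else:
                tn += 1      # background correctly rejected
    return tp, fp, fn, tn


def recall(tp: int, fn: int) -> float:
    """Share of reference landslide pixels that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all pixels that were classified correctly (assumed definition)."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Hypothetical 2 x 3 masks, for illustration only.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]

tp, fp, fn, tn = confusion_counts(predicted, reference)
print(f"recall   = {recall(tp, fn):.3f}")
print(f"accuracy = {accuracy(tp, fp, fn, tn):.3f}")
```

Recall penalizes missed landslide pixels only, whereas accuracy as defined here also rewards correctly rejected background, so the two values can diverge when landslides cover a small part of the image.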