Face mask-wearing detection is important for safety protection during the epidemic. Occlusion, complex illumination, and dense crowds lower the accuracy of mask-wearing detection. To address this problem, this paper proposes a neural network model for mask-wearing detection in complex environments, built around an attention mechanism and the choice of bounding box loss function. Building on YOLOv5s, we first introduce an attention mechanism into the feature fusion process to improve feature utilization and compare the effect of three attention mechanisms (CBAM, SE, and CA) on the deep network model. We then examine the influence of three bounding box loss functions (GIoU, CIoU, and DIoU) on mask-wearing recognition and adopt CIoU as the bounding box regression loss to improve localization accuracy. The dataset consists of 7,958 collected mask-wearing images and a large number of images of people without masks. With YOLOv5s as the baseline model, the proposed model reaches an mAP of 90.96% on the validation set, which is significantly better than the traditional deep learning method. Mask-wearing detection is also carried out in a real environment, and the experimental results show that the proposed method can meet daily detection requirements.
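The three bounding box losses named above (GIoU, DIoU, and CIoU) differ only in the penalty they subtract from the plain IoU. As a minimal sketch, and not the paper's implementation, the following pure-Python function computes all three for a pair of axis-aligned boxes given as (x1, y1, x2, y2), using the standard published definitions. The function name `box_regression_losses` and the `eps` stabilizer are assumptions introduced for illustration.

```python
import math


def box_regression_losses(pred, target, eps=1e-9):
    """Return GIoU, DIoU and CIoU losses for two (x1, y1, x2, y2) boxes.

    Illustrative only: `eps` is an assumed small constant that avoids
    division by zero; it is not taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection and union areas give the plain IoU.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c_area = cw * ch

    # GIoU: penalize the empty part of the enclosing box.
    giou = iou - (c_area - union) / (c_area + eps)

    # DIoU: penalize squared center distance over squared enclosing diagonal.
    rho2 = ((px1 + px2) - (tx1 + tx2)) ** 2 / 4 + ((py1 + py2) - (ty1 + ty2)) ** 2 / 4
    c2 = cw ** 2 + ch ** 2
    diou = iou - rho2 / (c2 + eps)

    # CIoU: DIoU plus an aspect-ratio consistency term v weighted by alpha.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    ciou = diou - alpha * v

    return {"GIoU": 1 - giou, "DIoU": 1 - diou, "CIoU": 1 - ciou}


if __name__ == "__main__":
    # Two partially overlapping squares with the same aspect ratio.
    print(box_regression_losses((0, 0, 2, 2), (1, 1, 3, 3)))
```

When the two boxes share the same aspect ratio, the term `v` vanishes and the CIoU loss coincides with the DIoU loss; the two differ only when the predicted box has the wrong shape, which is the extra signal CIoU contributes to localization.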