Masks play a crucial role in preventing respiratory diseases and are widely used in public health and industrial safety. Mask-wearing detection systems must therefore be both accurate and fast enough for real-time use. To address the heavy computation, large parameter count, and difficult hardware deployment of current mask-wearing detection systems, this study proposes a lightweight mask detection model based on an improved YOLOv5. First, a new lightweight network, EMA-FasterNet, is proposed as the YOLOv5 backbone; it reduces computation while preserving the information in each channel. Second, some C3 modules in the Neck are replaced with Depthwise Separable Convolutions (DepthSepConv), further reducing model size, parameter count, and computation. Finally, Soft Non-Maximum Suppression (Soft-NMS) replaces standard NMS so that objects are not missed when overlapping candidate boxes are removed. Compared with YOLOv5s, the proposed YOLOv5-S2C2 achieves high mAP on both Dataset I and Dataset II, while reducing parameters and computation by about 56% and 57.0%, respectively. The model size is about 44% of the original, and GPU inference speed improves by about 30%. These results show that the proposed model is lightweight, delivers excellent real-time performance, and is well suited to hardware with limited computational resources.
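
To illustrate why replacing NMS with Soft-NMS can reduce missed detections, the following is a minimal pure-Python sketch of the general Soft-NMS idea. It is not the implementation used in this work. The abstract does not state which decay variant is used, so the Gaussian penalty, the values `sigma=0.5` and `score_thresh=0.001`, and the function names `iou` and `soft_nms` are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative; variant and defaults are assumptions).

    Returns a list of (box_index, rescored_confidence) in selection order.
    Overlapping boxes are not deleted outright; their scores are decayed.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            # Drop a box only once its score has become negligible.
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Two heavily overlapping boxes (e.g. two adjacent faces) and one distant box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of roughly 0.68, so standard NMS with a typical 0.5 threshold would discard it. Soft-NMS keeps it with a reduced confidence, leaving the decision to a later score threshold.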