With the global COVID-19 pandemic, masks have become essential in public places, which poses challenges for security and convenience systems built on facial recognition, such as access control and payment systems. In existing solutions, gender constraints help improve the accuracy of face mask segmentation, but in some cases, such as transgender people and individuals with ambiguous gender expression, they may lead to gender misclassification that degrades the segmentation results. Deep learning methods may also increase computational complexity, so they may fail to meet real-time requirements when a large number of images must be processed quickly. This paper therefore studies a face mask segmentation method that combines salient features and gender constraints. To enable real-time face detection on hardware platforms, we introduce depthwise separable convolution to optimize the structure of the multi-task cascaded convolutional neural network (MTCNN), which performs the face detection task combining salient features and gender constraints. We then extract the face mask region and provide the technical steps for face mask extraction based on spectral features. Experimental results verify the effectiveness of the constructed model.
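As a rough illustration of why depthwise separable convolution reduces computational cost, the sketch below compares the weight count of a standard convolution layer with that of its depthwise separable counterpart. The function names and the layer sizes (16 input channels, 32 output channels, 3x3 kernel) are assumptions chosen for illustration and are not taken from the paper's network; bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 16, 32, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these assumed sizes the separable layer needs 656 weights against 4608 for the standard layer. In general the ratio is 1/c_out + 1/k^2, so the saving grows with the number of output channels and the kernel size.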