Cervical squamous intraepithelial lesions (SILs) are precursor lesions of cervical cancer, and their accurate diagnosis enables patients to be treated before malignancy manifests. However, identifying SILs is usually laborious, and diagnostic consistency is low because pathological SIL images are highly similar. Although artificial intelligence (AI), especially deep learning, has drawn considerable attention for its good performance in cervical cytology tasks, the use of AI for cervical histology is still in its early stages. Existing models are inadequate in feature extraction, in representation capability, and in their use of p16 immunohistochemistry (IHC). Therefore, in this study, we first designed a squamous epithelium segmentation algorithm and assigned the corresponding labels. Second, the p16-positive areas of IHC slides were extracted with Whole Image Net (WI-Net), mapped back to the H&E slides, and used to generate a p16-positive mask for training. Finally, the p16-positive areas were input into Swin-B and ResNet-50 to classify the SILs. The dataset comprised 6171 patches from 111 patients; patches from 80% of the 90 patients were used for the training set. The accuracy of the proposed Swin-B method for high-grade squamous intraepithelial lesion (HSIL) was 0.914 [0.889–0.928]. The ResNet-50 model for HSIL achieved an area under the receiver operating characteristic curve (AUC) of 0.935 [0.921–0.946] at the patch level, with an accuracy, sensitivity, and specificity of 0.845, 0.922, and 0.829, respectively. These results indicate that our model can accurately identify HSIL, helping pathologists resolve practical diagnostic problems and potentially guiding the follow-up treatment of patients.
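For readers less familiar with the patch-level metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from binary predictions. It is an illustration only, not the evaluation code used in this study: the function name `binary_metrics`, the label convention (1 = HSIL, 0 = non-HSIL), and the toy labels are all assumptions introduced here.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels.

    Assumed convention (hypothetical, for illustration): 1 = HSIL, 0 = non-HSIL.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # HSIL correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-HSIL correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-HSIL flagged as HSIL
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # HSIL missed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Sensitivity: fraction of true HSIL patches that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-HSIL patches that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy patch labels, invented for this example (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A model can pair high sensitivity with lower specificity, as in the ResNet-50 figures above: it misses few HSIL patches at the cost of flagging more non-HSIL patches.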