Hybrid pairing of the corresponding silkworm species is a pivotal step in sericulture: it ensures egg quality and directly influences silk quantity and quality. Given the potential of image recognition and the effect of varying pupal postures, this study used machine learning and deep learning to build global models that identify pupal species and sex, either separately or simultaneously. Traditional feature-based approaches, deep learning feature-based approaches, and their fusion were compared. First, 3600 images covering the back, abdomen, and side postures of male and female pupae from five species were captured. Next, six traditional descriptors, including the histogram of oriented gradients (HOG), and six deep learning descriptors, including ConvNeXt-S, were used to extract significant species and sex features. Finally, classification models were constructed using a multilayer perceptron (MLP), a support vector machine, and a random forest. The results indicate that the {HOG + ConvNeXt-S + MLP} model performed best, achieving 99.09% accuracy for separate species and sex recognition and 98.40% for simultaneous recognition, with areas under the precision–recall and receiver operating characteristic curves ranging from 0.984 to 1.0 and from 0.996 to 1.0, respectively. In conclusion, this model can capture subtle distinctions between pupal species and sexes and shows promise for extensive application in sericulture.
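
The fusion and simultaneous-recognition ideas summarized above can be illustrated with a minimal sketch. The abstract does not state how the traditional and deep descriptors are combined or how joint labels are encoded, so the choices here (L2-normalizing each descriptor before concatenation, and mapping five species and two sexes onto ten joint classes) are assumptions for illustration only. The function names and the toy vectors are hypothetical; real HOG and ConvNeXt-S features would be far longer.

```python
import math

# Assumed counts from the abstract: 5 species, 2 sexes.
# The encoding below is an illustrative assumption, not the study's stated scheme.
NUM_SPECIES = 5
NUM_SEXES = 2


def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_features(traditional, deep):
    """Feature-level fusion by concatenation (assumed rule).

    Each descriptor is normalized first so that neither one dominates
    the fused vector purely because of its scale.
    """
    return l2_normalize(traditional) + l2_normalize(deep)


def joint_label(species_idx, sex_idx):
    """Encode (species, sex) as a single class index in [0, 10) for simultaneous recognition."""
    if not (0 <= species_idx < NUM_SPECIES and 0 <= sex_idx < NUM_SEXES):
        raise ValueError("species or sex index out of range")
    return species_idx * NUM_SEXES + sex_idx


def split_label(joint):
    """Decode a joint class index back into (species_idx, sex_idx)."""
    return divmod(joint, NUM_SEXES)


# Toy stand-ins for a HOG vector and a deep feature vector (hypothetical values).
hog_vec = [3.0, 4.0]
deep_vec = [0.0, 2.0, 0.0]
fused = fuse_features(hog_vec, deep_vec)  # length 5: 2 traditional + 3 deep components
label = joint_label(4, 1)                 # last species, second sex -> class 9
```

Under these assumptions, the fused vector would be passed to a single classifier such as an MLP. Separate recognition would train one classifier on the species index and another on the sex index, whereas simultaneous recognition would train one classifier on the joint index.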