In the context of traditional wheat cultivation, issues such as a lack of seedlings and the prolonged monopoly of seedlings are frequently encountered. These phenomena have a considerable impact on both grain yield and the income of farmers. The traditional methods of identifying wheat seedling varieties rely on manual observation and measurement. However, these methods are time-consuming, labor-intensive, and susceptible to subjective influences, resulting in poor timeliness and robustness. The detection accuracy and speed of wheat seedling variety identification and classification can be improved by using deep learning models. However, there is still relatively little research on this subject. In this study, a McaxseNet lightweight model wheat variety identification classification method is proposed. The method is based on the MobileVit-XS network model, which efficiently identifies global feature information. The introduction of the CBAM attention mechanism in the MV2 module enables the MV2 module to be more focused and accurate when processing features. It is proposed that the XSE module incorporate the SE attention mechanism in the improved Xception module, followed by residual linking, to address the gradient vanishing problem and enhance the feature extraction capability of the model, while simultaneously improving its robustness. The McaxseNet lightweight model was trained on 30 datasets in a wheat test field, comprising a total of 29,673 images of wheat seedlings from 30 wheat varieties. The average accuracy of the dataset is 98.27%, which represents a 5.94% improvement over that of the MobileViT model. Furthermore, the model's number of parameters is only 10.51MB, and the execution time for processing a single wheat seedling image is 24.1ms. In comparison to other convolutional neural network models, McaxseNet exhibits a higher degree of accuracy while maintaining a relatively low number of parameters. In comparison to other convolutional neural network models, McaxseNet exhibits a higher degree of accuracy while maintaining a relatively low number of parameters.