Wheat leaf diseases are considered the foremost threat to wheat yield. Convolutional neural networks (CNNs) have become important tools for crop disease detection, and both the training strategy and the initial learning rate strongly affect a CNN's performance and training speed. This study compared six training strategies (Adam, SGD, Adam + StepLR, SGD + StepLR, Warm-up + Cosine annealing + SGD, and Warm-up + Cosine annealing + Adam), each with three initial learning rates (0.05, 0.01, and 0.001). Five lightweight CNN models, namely MobileNetV3, ShuffleNetV2, GhostNet, MnasNet, and EfficientNetV2, were evaluated on datasets of wheat stripe rust, wheat powdery mildew, and healthy wheat leaves. With SGD + StepLR and an initial learning rate of 0.001, MnasNet obtained the highest recognition accuracy, 98.65%. This was 1.1% higher than the accuracy obtained with a fixed learning rate, and the parameter size of the model was only 19.09 M. These results indicate that MnasNet is suitable for porting to mobile terminals and efficient at automatically identifying wheat leaf diseases.
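The two learning-rate schedules named above can be illustrated with a minimal, framework-free sketch. It is not the study's implementation: the function names and the values of `step_size`, `gamma`, `warmup_epochs`, `total_epochs`, and `min_lr` are assumptions chosen for illustration, since the abstract does not report them. Only the strategy names and the three initial learning rates come from the text.

```python
import math

# Strategy names and initial learning rates as listed in the abstract.
STRATEGIES = [
    "Adam",
    "SGD",
    "Adam + StepLR",
    "SGD + StepLR",
    "Warm-up + Cosine annealing + SGD",
    "Warm-up + Cosine annealing + Adam",
]
INITIAL_LRS = [0.05, 0.01, 0.001]


def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Step decay: multiply the rate by gamma every step_size epochs.

    step_size and gamma are assumed values, not taken from the study.
    """
    return base_lr * gamma ** (epoch // step_size)


def warmup_cosine_lr(base_lr, epoch, total_epochs=100, warmup_epochs=5, min_lr=0.0):
    """Linear warm-up to base_lr, then cosine annealing down to min_lr.

    total_epochs, warmup_epochs, and min_lr are assumed values.
    """
    if epoch < warmup_epochs:
        # Ramp linearly so the last warm-up epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Every (strategy, initial learning rate) pair that would be tried per model.
GRID = [(s, lr) for s in STRATEGIES for lr in INITIAL_LRS]

if __name__ == "__main__":
    for epoch in (0, 5, 10, 50, 99):
        print(epoch, step_lr(0.001, epoch), warmup_cosine_lr(0.001, epoch))
```

Under these assumptions, a fixed-rate strategy keeps the initial value for every epoch, StepLR lowers it in discrete jumps, and warm-up with cosine annealing first raises it and then decays it smoothly. The grid contains 6 × 3 = 18 configurations per model.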