Tools for robust identification of crop diseases are crucial for timely intervention by farmers to minimize yield losses. Visual diagnosis of crop diseases is time-consuming and laborious, and has become increasingly unsuitable for the needs of modern agricultural production. Recently, deep convolutional neural networks (CNNs) have been used for crop disease diagnosis due to their rapidly improving accuracy in labeling images. However, previous CNN studies have mostly used images of single leaves photographed under controlled conditions, which limits operational field use. In addition, the wide variety of available CNNs and training options raises important questions about how best to implement CNNs for disease diagnosis. Here, we present an assessment of seven typical CNNs (VGG-16, Inception-v3, ResNet-50, DenseNet-121, EfficientNet-B6, ShuffleNet-v2 and MobileNetV3) under different training strategies for the identification of the main leaf diseases of wheat (powdery mildew, leaf rust and stripe rust) using field images. We developed a Field-based Wheat Diseases Images (FWDI) dataset of field-acquired images to supplement the public PlantVillage dataset of individual leaves imaged under controlled conditions. We found that a transfer-learning method employing retuning of all parameters produced the highest accuracy for all CNNs. Based on this training strategy, Inception-v3 achieved the highest identification accuracy of 92.5% on the test dataset. While lightweight CNN models (e.g., ShuffleNet-v2 and MobileNetV3) had shorter processing times (<0.007 s per image) and smaller memory requirements for the model parameters (<20 MB), their accuracy was relatively low (~87%). In addition to the role of CNN architecture in controlling overall accuracy, environmental effects (e.g., residual water stains on healthy leaves) were found to cause misclassifications in the field images.
Moreover, the small size of some target symptoms and the similarity of symptoms between some of the diseases further reduced the accuracy. Overall, the study provides insight into the collective effects of model architecture, training strategies and input datasets on the performance of CNNs, and offers guidance for robust CNN design for timely and accurate crop disease diagnosis in a real-world environment.