DIC-Transformer: interpretation of plant disease classification results using image caption generation technology

Zeng, Qingtian; Sun, Jian; Wang, Shansong

doi:10.3389/fpls.2023.1273029

Cited by 3 publications

(1 citation statement)

References 71 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Comparative analysis against state-of-the-art models revealed that the improved MobileNetV2 surpassed its counterparts in terms of accuracy, recall, F1 score, and overall accuracy. Zeng et al (2024) proposed the DIC-Transformer model, which first detects disease areas using Faster R-CNN combined with Swin Transformer and generates disease image feature vectors, then employs a Transformer to generate image descriptions, enhancing subsequent classifier decoder performance through weighted fusion of text features with image feature vectors. Experiments showed that DIC-Transformer outperforms other comparative models in both classification and description generation tasks.…”

Section: Introductionmentioning

confidence: 99%

Plant leaf disease recognition based on improved SinGAN and improved ResNet34

Chen,

Hu,

Yang

2024

Front. Artif. Intell.

View full text Add to dashboard Cite

The identification of plant leaf diseases is crucial in precision agriculture, playing a pivotal role in advancing the modernization of agriculture. Timely detection and diagnosis of leaf diseases for preventive measures significantly contribute to enhancing both the quantity and quality of agricultural products, thereby fostering the in-depth development of precision agriculture. However, despite the rapid development of research on plant leaf disease identification, it still faces challenges such as insufficient agricultural datasets and the problem of deep learning-based disease identification models having numerous training parameters and insufficient accuracy. This paper proposes a plant leaf disease identification method based on improved SinGAN and improved ResNet34 to address the aforementioned issues. Firstly, an improved SinGAN called Reconstruction-Based Single Image Generation Network (ReSinGN) is proposed for image enhancement. This network accelerates model training speed by using an autoencoder to replace the GAN in the SinGAN and incorporates a Convolutional Block Attention Module (CBAM) into the autoencoder to more accurately capture important features and structural information in the images. Random pixel Shuffling are introduced in ReSinGN to enable the model to learn richer data representations, further enhancing the quality of generated images. Secondly, an improved ResNet34 is proposed for plant leaf disease identification. This involves adding CBAM modules to the ResNet34 to alleviate the limitations of parameter sharing, replacing the ReLU activation function with LeakyReLU activation function to address the problem of neuron death, and utilizing transfer learning-based training methods to accelerate network training speed. This paper takes tomato leaf diseases as the experimental subject, and the experimental results demonstrate that: (1) ReSinGN generates high-quality images at least 44.6 times faster in training speed compared to SinGAN. (2) The Tenengrad score of images generated by the ReSinGN model is 67.3, which is improved by 30.2 compared to the SinGAN, resulting in clearer images. (3) ReSinGN model with random pixel Shuffling outperforms SinGAN in both image clarity and distortion, achieving the optimal balance between image clarity and distortion. (4) The improved ResNet34 achieved an average recognition accuracy, recognition precision, recognition accuracy (redundant as it’s similar to precision), recall, and F1 score of 98.57, 96.57, 98.68, 97.7, and 98.17%, respectively, for tomato leaf disease identification. Compared to the original ResNet34, this represents enhancements of 3.65, 4.66, 0.88, 4.1, and 2.47%, respectively.

show abstract

Section: Introductionmentioning

confidence: 99%

Plant leaf disease recognition based on improved SinGAN and improved ResNet34

Chen,

Hu,

Yang

2024

Front. Artif. Intell.

View full text Add to dashboard Cite

show abstract

Dynamic text prompt joint multimodal features for accurate plant disease image captioning

Liang,

Huang,

Wang

et al. 2024

Vis Comput

View full text Add to dashboard Cite

A Deep Learning Model for Accurate Maize Disease Detection Based on State-Space Attention and Feature Fusion

Zhu,

Yan,

et al. 2024

Plants

View full text Add to dashboard Cite

In improving agricultural yields and ensuring food security, precise detection of maize leaf diseases is of great importance. Traditional disease detection methods show limited performance in complex environments, making it challenging to meet the demands for precise detection in modern agriculture. This paper proposes a maize leaf disease detection model based on a state-space attention mechanism, aiming to effectively utilize the spatiotemporal characteristics of maize leaf diseases to achieve efficient and accurate detection. The model introduces a state-space attention mechanism combined with a multi-scale feature fusion module to capture the spatial distribution and dynamic development of maize diseases. In experimental comparisons, the proposed model demonstrates superior performance in the task of maize disease detection, achieving a precision, recall, accuracy, and F1 score of 0.94. Compared with baseline models such as AlexNet, GoogLeNet, ResNet, EfficientNet, and ViT, the proposed method achieves a precision of 0.95, with the other metrics also reaching 0.94, showing significant improvement. Additionally, ablation experiments verify the impact of different attention mechanisms and loss functions on model performance. The standard self-attention model achieved a precision, recall, accuracy, and F1 score of 0.74, 0.70, 0.72, and 0.72, respectively. The Convolutional Block Attention Module (CBAM) showed a precision of 0.87, recall of 0.83, accuracy of 0.85, and F1 score of 0.85, while the state-space attention module achieved a precision of 0.95, with the other metrics also at 0.94. In terms of loss functions, cross-entropy loss showed a precision, recall, accuracy, and F1 score of 0.69, 0.65, 0.67, and 0.67, respectively. Focal loss showed a precision of 0.83, recall of 0.80, accuracy of 0.81, and F1 score of 0.81. State-space loss demonstrated the best performance in these experiments, achieving a precision of 0.95, with recall, accuracy, and F1 score all at 0.94. These results indicate that the model based on the state-space attention mechanism achieves higher detection accuracy and better generalization ability in the task of maize leaf disease detection, effectively improving the accuracy and efficiency of disease recognition and providing strong technical support for the early diagnosis and management of maize diseases. Future work will focus on further optimizing the model’s spatiotemporal feature modeling capabilities and exploring multi-modal data fusion to enhance the model’s application in real agricultural scenarios.

show abstract

DIC-Transformer: interpretation of plant disease classification results using image caption generation technology

Cited by 3 publications

References 71 publications

Plant leaf disease recognition based on improved SinGAN and improved ResNet34

Plant leaf disease recognition based on improved SinGAN and improved ResNet34

Dynamic text prompt joint multimodal features for accurate plant disease image captioning

A Deep Learning Model for Accurate Maize Disease Detection Based on State-Space Attention and Feature Fusion

Contact Info

Product

Resources

About