Fusing attention mechanism with Mask R-CNN for instance segmentation of grape cluster in the field

Shen, Lei; Su, Jinya; Quan, Wumeng; Song, Yuyang; Fang, Yulin; Su, Baofeng

doi:10.3389/fpls.2022.934450

Cited by 16 publications

(4 citation statements)

References 37 publications

“…Instance segmentation, one of the challenging tasks in machine vision, requires the generation of pixel-level segmentation masks for each object on the basis of image classification [ 24 ]. Different from semantic segmentation, instance segmentation needs to distinguish different instances of the same class.…”

Section: Methods (mentioning)

confidence: 99%
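The distinction drawn in the statement above can be illustrated with a toy example. This is only a sketch of how the two outputs differ as data; the class id, the grid, and the helper name `collapse_to_semantic` are hypothetical and do not come from the cited papers.

```python
# Toy 4x6 label map: 0 = background, 1 = "grape cluster" (hypothetical class id).
# Semantic segmentation yields one label per pixel, so both clusters look alike.
semantic_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Instance segmentation yields one mask per object; here each mask is a set of
# (row, col) pixels, and both objects carry the same class id.
instances = [
    {"class_id": 1, "pixels": {(0, 1), (0, 2), (1, 1), (1, 2)}},
    {"class_id": 1, "pixels": {(1, 4), (1, 5), (2, 4), (2, 5)}},
]


def collapse_to_semantic(instances, height, width):
    """Merge per-instance masks into a single per-pixel class map."""
    label_map = [[0] * width for _ in range(height)]
    for inst in instances:
        for row, col in inst["pixels"]:
            label_map[row][col] = inst["class_id"]
    return label_map


collapsed = collapse_to_semantic(instances, height=4, width=6)
print(collapsed == semantic_map)  # instance masks can be reduced to the semantic map
print(len(instances))             # the object count exists only in the instance output
```

The reverse direction is not available: the semantic map alone does not say where one object of a class ends and the next begins.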

Instance Segmentation and Berry Counting of Table Grape before Thinning Based on AS-SwinT

Du, Liu

2023

Plant Phenomics

Berry thinning is one of the most important tasks in the management of high-quality table grapes. Farmers often thin the berries per cluster to a standard number by counting. With an aging population, it is hard to find adequate skilled farmers to work during thinning season. It is urgent to design an intelligent berry-thinning machine to avoid exhaustive repetitive labor. A machine vision system that can determine the number of berries removed and locate the berries removed is a challenge for the thinning machine. A method for instance segmentation of berries and berry counting in a single bunch is proposed based on AS-SwinT. In AS-SwinT, Swin Transformer is performed as the backbone to extract the rich characteristics of grape berries. An adaptive feature fusion is introduced to the neck network to sufficiently preserve the underlying features and enhance the detection of small berries. The size of berries in the dataset is statistically analyzed to optimize the anchor scale, and Soft-NMS is used to filter the candidate frames to reduce the missed detection of densely shaded berries. Finally, the proposed method could achieve 65.7 AP^box, 95.0 AP_0.5^box, 57 AP_s^box, 62.8 AP^mask, 94.3 AP_0.5^mask, 48 AP_s^mask, which is markedly superior to Mask R-CNN, Mask Scoring R-CNN, and Cascade Mask R-CNN. Linear regressions between predicted numbers and actual numbers are also developed to verify the precision of the proposed model. RMSE and R² values are 7.13 and 0.95, respectively, which are substantially higher than other models, showing the advantage of the AS-SwinT model in berry counting estimation.
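The abstract above names Soft-NMS but does not say which variant or settings were used. As a rough sketch only, the Gaussian form of Soft-NMS can be written as follows; the function names, `sigma=0.5`, `score_threshold=0.001`, and the example boxes are assumptions made for illustration, not the authors' configuration.

```python
import math


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: overlapping boxes have their scores decayed, not removed.

    Returns (index, rescored_value) pairs in the order the boxes were selected.
    """
    remaining = [(score, idx) for idx, score in enumerate(scores)]
    kept = []
    while remaining:
        best = max(remaining)  # highest current score wins
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))
        decayed = []
        for score, idx in remaining:
            iou = box_iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(iou ** 2) / sigma)  # stronger overlap, stronger decay
            if score >= score_threshold:
                decayed.append((score, idx))
        remaining = decayed
    return kept


# Hypothetical detections: boxes 0 and 1 overlap heavily, box 2 is isolated.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print([idx for idx, _ in kept])
```

With hard NMS at a typical IoU threshold of 0.5, box 1 would be discarded outright. Here it survives with a reduced score, which is the property that helps when neighbouring berries occlude each other.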

Section: Methods (mentioning)

confidence: 99%

Instance Segmentation and Berry Counting of Table Grape before Thinning Based on AS-SwinT

Du, Liu

2023

Plant Phenomics

“…In order to apply existing deep learning algorithms to other domains, it is typically necessary to make improvements upon the existing algorithms, enabling the deep learning models to be better suited for application in those domains. Such as grape detection (L. Shen et al., 2022), apple detection (Wang & He, 2022), ship detection (Nie et al., 2020), tunnel surface defects (Marasco et al., 2022; Xu et al., 2021), moisture marks of shield tunnel lining (Xue & Li, 2018; Zhao et al., 2020), concrete crack detection (J. Deng et al., 2020), distance measure (Naranjo et al., 2021), facial expression (Benamara et al., 2021), multi‐object tracking (Urdiales et al., 2023), epileptic seizure detection (Nogay & Adeli, 2021), and small object detection (Yu et al., 2023). These studies primarily employed attention mechanisms and deformable convolutional networks (Dai et al., 2017) to enhance the recognition capabilities of the model.…”

Section: Related Work (mentioning)

confidence: 99%

Intelligent recognition of joints and fissures in tunnel faces using an improved mask region‐based convolutional neural network algorithm

Lei, Zhang, Deng et al. 2023

Computer aided Civil Eng

To address the challenges of low recognition accuracy, low robustness, and low detection efficiency in existing tunnel face joint and fissure recognition methods, we present a deep learning recognition segmentation algorithm called the mask region convolutional neural network (Mask R‐CNN) that is enhanced by an advanced Transformer attention mechanism and deformable convolution network (Mask R‐CNN‐TD). The Transformer attention mechanism improves the backbone network's ability to extract image features by focusing on important areas. A deformable convolutional network enables the network to more precisely conform to the morphological characteristics of joints and fissures on the tunnel face, thereby enhancing the accuracy of detection. Experimental results demonstrate that Mask R‐CNN‐TD achieves superior performance, compared to Mask R‐CNN series algorithms and other instance segmentation methods in terms of detection accuracy, with mean average precision scores of 70.5%, 70.8%, 53.2%, and 63.3% for detection box and mask segmentation at thresholds of 0.5 and 0.75, respectively. Based on the stable and efficient Mask R‐CNN‐TD model, we developed a mobile application called tunnel face detector to automatically detect tunnel faces on the construction site.

“…According to the image intersection over union, the corresponding ratio was obtained by comparing the prediction box and the real box repetition rates. Subsequently, it The main body of the network model for the Mask R-CNN algorithm was based on Faster R-CNN, with the addition of a fully convolutional network to predict the semantic segmentation [24]. First, the residual network (Res-Net) was used as the feature to extract the skeleton network, combined with a feature pyramid network (FPN) to utilize better high-level semantic features and low-level texture features that extract multi-scale information in the image [25]. The bilinear interpolation method was applied to the original region of interest (ROI) pooling to address the issue of the candidate box extraction process sampling an integer value for the tensor's sampling point [26].…”

Section: The Mask R-CNN Algorithm Model (mentioning)

confidence: 99%
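The last sentence of the statement above refers to replacing integer-snapped sampling in ROI pooling with bilinear interpolation, the idea behind RoIAlign in Mask R-CNN. The sketch below shows only the interpolation step on a tiny grid; the function name `bilinear_sample` and the 3x3 feature values are hypothetical, and a full RoIAlign layer would also divide each region into bins and average several such samples per bin.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2D feature map (list of rows) at a fractional (y, x) location."""
    height, width = len(feature), len(feature[0])
    # Clamp the sampling point to the valid area of the map.
    y = min(max(y, 0.0), height - 1)
    x = min(max(x, 0.0), width - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    # Blend along x on the two neighbouring rows, then blend those along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


# Hypothetical 3x3 feature map.
feature = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]

# Integer snapping (as in plain ROI pooling) reads a single neighbouring cell.
quantized = feature[int(0.5)][int(1.5)]
# Bilinear interpolation blends the four surrounding cells instead.
interpolated = bilinear_sample(feature, 0.5, 1.5)
print(quantized, interpolated)
```

The gap between the two values is the misalignment that integer snapping introduces, and it matters most for pixel-level mask prediction.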

Deep Learning-Based Modified YOLACT Algorithm on Magnetic Resonance Imaging Images for Screening Common and Difficult Samples of Breast Cancer

Wang

2023

Diagnostics

Computer-aided methods have been extensively applied for diagnosing breast lesions with magnetic resonance imaging (MRI), but fully-automatic diagnosis using deep learning is rarely documented. Deep-learning-technology-based artificial intelligence (AI) was used in this work to classify and diagnose breast cancer based on MRI images. Breast cancer MRI images from the Rider Breast MRI public dataset were converted into processable joint photographic expert group (JPG) format images. The location and shape of the lesion area were labeled using the Labelme software. A difficult-sample mining mechanism was introduced to improve the performance of the YOLACT algorithm model as a modified YOLACT algorithm model. Diagnostic efficacy was compared with the Mask R-CNN algorithm model. The deep learning framework was based on PyTorch version 1.0. Four thousand and four hundred labeled data with corresponding lesions were labeled as normal samples, and 1600 images with blurred lesion areas as difficult samples. The modified YOLACT algorithm model achieved higher accuracy and better classification performance than the YOLACT model. The detection accuracy of the modified YOLACT algorithm model with the difficult-sample-mining mechanism is improved by nearly 3% for common and difficult sample images. Compared with Mask R-CNN, it is still faster in running speed, and the difference in recognition accuracy is not obvious. The modified YOLACT algorithm had a classification accuracy of 98.5% for the common sample test set and 93.6% for difficult samples. We constructed a modified YOLACT algorithm model, which is superior to the YOLACT algorithm model in diagnosis and classification accuracy.
