2022
DOI: 10.3389/fpls.2022.934450

Fusing attention mechanism with Mask R-CNN for instance segmentation of grape cluster in the field

Abstract: Accurately detecting and segmenting grape cluster in the field is fundamental for precision viticulture. In this paper, a new backbone network, ResNet50-FPN-ED, was proposed to improve Mask R-CNN instance segmentation so that the detection and segmentation performance can be improved under complex environments, cluster shape variations, leaf shading, trunk occlusion, and grapes overlapping. An Efficient Channel Attention (ECA) mechanism was first introduced in the backbone network to correct the extracted feat…

Cited by 16 publications (4 citation statements)
References 37 publications

“…Instance segmentation, one of the challenging tasks in machine vision, requires the generation of pixel-level segmentation masks for each object on the basis of image classification [ 24 ]. Different from semantic segmentation, instance segmentation needs to distinguish different instances of the same class.…”
Section: Methods; citation type: mentioning
confidence: 99%
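The distinction drawn in the statement above can be illustrated with a toy example. The sketch below is not taken from the cited paper: the function name `to_semantic` and the two hand-written 3x4 masks are hypothetical. It shows that an instance segmentation keeps one binary mask per object, while a semantic segmentation of the same scene collapses them into a single per-class mask and so loses the object count.

```python
def to_semantic(instance_masks):
    """Collapse per-instance binary masks (lists of rows) into one class mask.

    A pixel is 1 in the result if any instance covers it, so the identity
    of the individual instances is discarded.
    """
    height = len(instance_masks[0])
    width = len(instance_masks[0][0])
    return [
        [int(any(mask[r][c] for mask in instance_masks)) for c in range(width)]
        for r in range(height)
    ]


# Two hypothetical grape-cluster instances on a 3x4 image.
cluster_a = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
cluster_b = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

instances = [cluster_a, cluster_b]   # instance view: two separate objects
semantic = to_semantic(instances)    # semantic view: one "grape" mask

print(len(instances))  # number of objects is known in the instance view
print(semantic)        # only per-pixel class membership remains
```

Going in the other direction (from the semantic mask back to the instances) is not possible in general, for example when clusters overlap, which is why instance segmentation is treated as the harder task.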
“…In order to apply existing deep learning algorithms to other domains, it is typically necessary to make improvements upon the existing algorithms, enabling the deep learning models to be better suited for application in those domains. Such as grape detection (L. Shen et al., 2022), apple detection (Wang & He, 2022), ship detection (Nie et al., 2020), tunnel surface defects (Marasco et al., 2022; Xu et al., 2021), moisture marks of shield tunnel lining (Xue & Li, 2018; Zhao et al., 2020), concrete crack detection (J. Deng et al., 2020), distance measure (Naranjo et al., 2021), facial expression (Benamara et al., 2021), multi‐object tracking (Urdiales et al., 2023), epileptic seizure detection (Nogay & Adeli, 2021), and small object detection (Yu et al., 2023). These studies primarily employed attention mechanisms and deformable convolutional networks (Dai et al., 2017) to enhance the recognition capabilities of the model.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…According to the image intersection over union, the corresponding ratio was obtained by comparing the prediction box and the real box repetition rates. Subsequently, it The main body of the network model for the Mask R-CNN algorithm was based on Faster R-CNN, with the addition of a fully convolutional network to predict the semantic segmentation [24]. First, the residual network (Res-Net) was used as the feature to extract the skeleton network, combined with a feature pyramid network (FPN) to utilize better high-level semantic features and low-level texture features that extract multi-scale information in the image [25]. The bilinear interpolation method was applied to the original region of interest (ROI) pooling to address the issue of the candidate box extraction process sampling an integer value for the tensor's sampling point [26].…”
Section: The Mask R-CNN Algorithm Model; citation type: mentioning
confidence: 99%
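Two of the operations mentioned in the statement above, intersection over union between a predicted and a ground-truth box, and bilinear interpolation at a non-integer sampling point, can be sketched in a few lines. This is a minimal illustration in plain Python, not the cited authors' implementation: the function names `box_iou` and `bilinear_sample`, the `(x1, y1, x2, y2)` box convention, and the clamping at the feature-map border are all assumptions made for the example.

```python
import math


def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def bilinear_sample(feature, x, y):
    """Sample a 2-D feature map (list of rows) at a fractional point (x, y).

    Instead of rounding (x, y) to an integer cell, the value is a weighted
    average of the four surrounding cells.
    """
    height, width = len(feature), len(feature[0])
    # Assumption: points outside the map are clamped to the border.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
print(box_iou(predicted, ground_truth))  # overlap 1, union 7

feature_map = [[0.0, 1.0], [2.0, 3.0]]
print(bilinear_sample(feature_map, 0.5, 0.5))  # centre of the four cells
```

Rounding the sampling point to the nearest cell would return one of the four corner values instead of their weighted average, which is the misalignment that the interpolation step is meant to avoid.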