2022
DOI: 10.3390/rs14163902

Pomelo Tree Detection Method Based on Attention Mechanism and Cross-Layer Feature Fusion

Abstract: Deep learning is the subject of increasing research for fruit tree detection. Previously developed deep-learning-based models are either too large to perform real-time tasks or too small to extract good enough features. Moreover, there has been scarce research on the detection of pomelo trees. This paper proposes a pomelo tree-detection method that introduces the attention mechanism and a Ghost module into the lightweight model network, as well as a feature-fusion module to improve the feature-extraction abili…

Citations: cited by 10 publications (9 citation statements)
References: 58 publications
“…As mentioned in Section 1, in our previous study, we proposed a YOLOx-nano pomelo tree detection method based on an attention mechanism and cross-layer feature fusion and showed that this method was more suitable for pomelo tree detection than was other state-of-the-art object detection algorithms. The present study showed that YOLOv5s and its attention-optimized models can detect IPTs with high accuracy, in line with the results of Yuan et al [49]. Although the structure of the network proposed by Yuan et al [49] was lightweight, the AP value reached 93.74%.…”
Section: Comparison With Other Related Work
Citation type: supporting
confidence: 90%
“…The present study showed that YOLOv5s and its attention-optimized models can detect IPTs with high accuracy, in line with the results of Yuan et al [49]. Although the structure of the network proposed by Yuan et al [49] was lightweight, the AP value reached 93.74%. Our optimized YOLOv5s models fully outperformed this network in terms of AP, which were all higher than 94.00%, with the highest AP value of 94.50%.…”
Section: Comparison With Other Related Work
Citation type: supporting
confidence: 90%
“…In the training process of large-scale remote sensing image object extraction algorithms, CNNs are usually used. Due to the limited size of shared convolutional kernels involved in network operations and not changing based on the size of the extracted object in the task, its global modeling ability is limited, which weakens the connection between the object to be extracted and its background in the image, resulting in the loss of some implicit spatial relationship information 52,53. In the process of purifying and fusing feature information using feature extraction networks, the number of pixels per unit area decreases exponentially with the size of the feature map, resulting in an increase in the object information represented by a single pixel.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Due to the limited size of shared convolutional kernels involved in network operations and not changing based on the size of the extracted object in the task, its global modeling ability is limited, which weakens the connection between the object to be extracted and its background in the image, resulting in the loss of some implicit spatial relationship information. 52,53 In the process of purifying and fusing feature information using feature extraction networks, the number of pixels per unit area decreases exponentially with the size of the feature map, resulting in an increase in the object information represented by a single pixel. When the objects are densely arranged, due to the continuous feature information abstraction, interference and mosaic phenomena will appear between the feature information of adjacent objects, resulting in dense object features being difficult to distinguish and interfering with each other.…”
Section: Dual Attention Mechanism Module
Citation type: mentioning
confidence: 99%