Auto-Selecting Receptive Field Network for Visual Tracking

Zhuang, Junfei; Dong, Yuan; Bai, Hongliang; Zuo, Peiliang; Cheng, Jianming

doi:10.1109/access.2019.2947472

Cited by 7 publications

(4 citation statements)

References 47 publications


“…Therefore, it becomes crucial to carefully replace the traditional convolutions with appropriate alternatives. Additionally, experiments have shown that introducing attention mechanisms [ 50 ], image enhancement methods [ 51 ], and enlarging the receptive field [ 52 ] can effectively improve detection performance. Nevertheless, these approaches come with the drawback of increased parameters and the calculation amount, making the deployment of the model challenging.…”

Section: Discussion (mentioning)

confidence: 99%

Tea-YOLOv8s: A Tea Bud Detection Model Based on Deep Learning and Computer Vision

Xie

Sun

2023

Sensors


Tea bud target detection is essential for mechanized selective harvesting. To address the challenges of low detection precision caused by the complex backgrounds of tea leaves, this paper introduces a novel model called Tea-YOLOv8s. First, multiple data augmentation techniques are employed to increase the amount of information in the images and improve their quality. Then, the Tea-YOLOv8s model combines deformable convolutions, attention mechanisms, and improved spatial pyramid pooling, thereby enhancing the model’s ability to learn complex object invariance, reducing interference from irrelevant factors, and enabling multi-feature fusion, resulting in improved detection precision. Finally, the improved YOLOv8 model is compared with other models to validate the effectiveness of the proposed improvements. The research results demonstrate that the Tea-YOLOv8s model achieves a mean average precision of 88.27% and an inference time of 37.1 ms, with an increase in the parameters and calculation amount by 15.4 M and 17.5 G, respectively. In conclusion, although the proposed approach increases the model’s parameters and calculation amount, it significantly improves various aspects compared to mainstream YOLO detection models and has the potential to be applied to tea buds picked by mechanization equipment.



“…Therefore, avoiding repetitive convolution is necessary for small object detection. The research conducted by Zhuang [23] et al showed that the receptive field is a key factor that affects the performance of a CNN, and increasing the receptive field helps to improve the classification task. Lei [24] et al demonstrated that dilated convolution can increase the perceptual field sizes of network layers and effectively expand the corresponding receptive field while retaining the valuable contextual information, providing stronger feature semantic information for the network.…”

Section: Dilated Convolution (mentioning)

confidence: 99%

An improved YOLO algorithm with multisensing for pedestrian detection

Gong,

Wang,

Huang

et al. 2024

SIViP


Although pedestrian detection techniques are improving, this task is still challenging due to the problems of target occlusion, small targets, and complex pedestrian backgrounds in images of different scenes. As a result, the You Only Look Once (YOLO) algorithm exhibits lower detection accuracy. In this paper, the use of multiple dilated convolutions to sample feature images is proposed to avoid the information loss incurred by repeated sampling and to improve the feature extraction and target detection performance of the algorithm. In addition, a lightweight shuffle-based efficient channel attention (SECA) mechanism is introduced to conduct grouping in the channel dimension and perform parallel processing for each subfeature map channel. A new branch is introduced to enrich the channel feature information for multiscale feature representation. Finally, a distance intersection over union-based nonmaximum suppression (DIoU-NMS) method is introduced to minimize the occurrence of missed targets due to occlusion by taking the centroid locations of the prediction box and ground truth box into account, without increasing the computational cost over that of normal NMS. Our method is extensively evaluated on several challenging pedestrian detection datasets, achieving 87.73%, 34.7%, 93.96% and 95.23% mean average precision (mAP) values on PASCAL VOC 2012, MS COCO, Caltech Pedestrian and INRIA Person, respectively. The experimental results demonstrate the effectiveness of the method.
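The citation statement for this entry says that dilated convolution enlarges the receptive field of a network layer. As a rough illustration of that claim only, the sketch below applies the standard receptive-field recurrence to a stack of convolution layers. The function names and the layer configurations are hypothetical and are not taken from any of the cited papers.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride, dilation). Hypothetical illustration only.
Layer = Tuple[int, int, int]


def effective_kernel(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers: List[Layer]) -> int:
    """Receptive field (in input pixels) of one output unit after all layers."""
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel, stride, dilation in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions with dilation 1, 2, 4.
plain = receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
print(plain, dilated)
```

With the same number of layers and weights, the dilated stack covers a wider input window, which is the effect the statement describes.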


“…At each time slot, each UAV samples a receiver of its transmission from this distribution. Especially, the number of GN operations determines the receptive field of GNN and how far packets can travel along edges in the network, selecting an appropriate receptive field will improve the performance of the method [38]. The receptive field refers to the specific region in the input space that a neuron or a group of neurons in a neural network is sensitive to.…”

Section: Graph Network Block (mentioning)

confidence: 99%

Unmanned Aerial Vehicle Cooperative Data Dissemination Based on Graph Neural Networks

Xing,

Zhang,

Wang

et al. 2024

Sensors


Unmanned Aerial Vehicles (UAVs) have critical applications in various real-world scenarios, including mapping unknown environments, military reconnaissance, and post-disaster search and rescue. In these scenarios where communication infrastructure is missing, UAVs will form an ad hoc network and perform tasks in a distributed manner. To efficiently carry out tasks, each UAV must acquire and share global status information and data from neighbors. Meanwhile, UAVs frequently operate in extreme conditions, including storms, lightning, and mountainous areas, which significantly degrade the quality of wireless communication. Additionally, the mobility of UAVs leads to dynamic changes in network topology. Therefore, we propose a method that utilizes graph neural networks (GNN) to learn cooperative data dissemination. This method leverages the network topology relationship and enables UAVs to learn a decision policy based on local data structure, ensuring that all UAVs can recover global information. We train the policy using reinforcement learning that enhances the effectiveness of each transmission. After repeated simulations, the results validate the effectiveness and generalization of the proposed method.
