Small object detection is a very challenging problem in computer vision research. Existing detection algorithms usually target objects at all scales and are not specifically optimized for small objects. In scenes dense with small objects, accuracy is low and some computing resources are wasted. An improved detection algorithm for small objects based on YOLOv5 was proposed. By reasonably clipping the feature map output of the large object detection layer, the computing resources required by the model were significantly reduced and the model became more lightweight. An improved feature fusion method (PB-FPN) for small object detection, based on PANet and BiFPN, was proposed, which effectively increased the algorithm's ability to detect small objects. By introducing the spatial pyramid pooling (SPP) module from the backbone network into the feature fusion network and connecting it to the model prediction head, the performance of the algorithm was effectively enhanced. The experiments demonstrated that the improved algorithm performs well in both detection accuracy and real-time capability. Compared with the classical YOLOv5, the mAP@0.5 and mAP@0.5:0.95 of SF-YOLOv5 increased by 1.6% and 0.8%, respectively, the number of network parameters was reduced by 68.2%, the computational cost (FLOPs) was reduced by 12.7%, and the inference time of the model was reduced by 6.9%.
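The abstract does not spell out how PB-FPN combines its inputs, so the following is only a sketch of the weighted "fast normalized fusion" used in BiFPN, one of the two methods PB-FPN is based on. The function name `fast_normalized_fusion`, the flat-list representation of a feature map, and the `eps` value are assumptions made for illustration and are not taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature maps with normalized, non-negative weights.

    features: list of feature maps, each flattened to a list of floats
              (hypothetical representation chosen for this sketch).
    weights:  one learnable scalar per feature map.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    size = len(features[0])
    if any(len(f) != size for f in features):
        raise ValueError("all feature maps must have the same size")

    # ReLU keeps every weight non-negative, as in BiFPN.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when all weights are clipped to zero.
    total = sum(clipped) + eps

    # Weighted sum of the inputs, divided by the sum of the weights.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(size)
    ]


# Two equally weighted inputs are (almost exactly) averaged.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```

Because the weights are normalized by their sum instead of a softmax, each output value stays within the range of the inputs while the fusion remains cheap to compute.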
Pedestrian multiobject tracking is a major research branch of computer vision. In complicated scenarios with frequent scale changes and occlusion, existing detection-based multiobject tracking methods have unsatisfactory tracking accuracy because their reidentification is not robust. This article proposed a multiobject tracking method that improves the reidentification module in YOLOv5-DeepSORT at a more fine-grained level. The feature extraction network for the Re-ID part of the algorithm is built on Res2Net and group convolution. Its hierarchical connection structure effectively improved the network's ability to extract multiscale features and at the same time enlarged the receptive field of each network layer. The PCB network structure, which divides feature maps evenly, is used at the output of the backbone network to strengthen the influence of local features on overall network performance. On this basis, the reidentification model is trained on the public datasets Market-1501 and DukeMTMC-reID using triplet loss. ER-DeepSORT is the algorithm obtained by integrating the improved reidentification module into DeepSORT. To evaluate tracking performance, ER-DeepSORT was compared on the MOT16 test sequences with YOLOv5-DeepSORT using the original reidentification module. The experimental results showed that ER-DeepSORT improved MOTA by 5.4% and MOTP by 2.2% on the Market-1501 dataset, and improved MOTA by 9.6% and MOTP by 2.7% on the DukeMTMC-reID dataset.
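The triplet loss named above can be illustrated with a minimal sketch in its standard form, max(0, d(a, p) - d(a, n) + margin), using Euclidean distance between embeddings. The function names, the plain-list embeddings, and the margin of 0.3 are assumptions made for illustration; the abstract does not state the distance metric, margin, or sampling strategy actually used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss for one (anchor, positive, negative) triplet.

    anchor and positive are embeddings of the same identity, negative is an
    embedding of a different identity. margin=0.3 is an assumed value.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    # The loss is zero once the negative is at least `margin` farther
    # from the anchor than the positive is.
    return max(0.0, d_pos - d_neg + margin)


# The positive is close and the negative is far, so this triplet
# already satisfies the margin and contributes no loss.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Swapping positive and negative gives a violated triplet with positive loss.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Minimizing this quantity pulls embeddings of the same pedestrian together and pushes embeddings of different pedestrians apart, which is the property a tracker relies on when it matches detections across frames by appearance.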