Ziming Zhu scite author profile

Multiple object tracking (MOT) in unmanned aerial vehicle (UAV) videos is a fundamental task and can be applied in many fields. MOT consists of two critical procedures, i.e., object detection and re-identification (ReID). One-shot MOT, which incorporates detection and ReID in a unified network, has gained attention due to its fast inference speed. It significantly reduces the computational overhead by making two subtasks share features. However, most existing one-shot trackers struggle to achieve robust tracking in UAV videos. We observe that the essential difference between detection and ReID leads to an optimization contradiction within one-shot networks. To alleviate this contradiction, we propose a novel feature decoupling network (FDN) to convert shared features into detection-specific and ReID-specific representations. The FDN searches for characteristics and commonalities between the two tasks to synergize detection and ReID. In addition, existing one-shot trackers struggle to locate small targets in UAV videos. Therefore, we design a pyramid transformer encoder (PTE) to enrich the semantic information of the resulting detection-specific representations. By learning scale-aware fine-grained features, the PTE empowers our tracker to locate targets in UAV videos accurately. Extensive experiments on VisDrone2021 and UAVDT benchmarks demonstrate that our tracker achieves state-of-the-art tracking performance.

Research on video-based character recognition method for train cargo cars

Zhao, Zhang, et al. 2022

Leveraging temporal-aware fine-grained features for robust multiple object tracking

Nie, Zhu, et al. 2022. J Supercomput

MSA-MOT: Multi-Stage Association for 3D Multimodality Multi-Object Tracking

Zhu, Nie, et al. 2022. Sensors

Three-dimensional multimodality multi-object tracking has attracted great attention due to the use of complementary information. However, such a framework generally adopts a one-stage association approach, which fails to perform precise matching between detections and tracklets and thus cannot robustly track objects in complex scenes. To address this matching problem caused by one-stage association, we propose a novel multi-stage association method, which consists of a hierarchical matching module and a customized track management module. Specifically, the hierarchical matching module defines the reliability of the objects by associating multimodal detections, and then matches detections with trajectories in order of reliability, which increases the utilization of true detections and thus guides accurate association. Then, based on the reliability of the trajectories provided by the matching module, the customized track management module sets a different maximum number of missing frames for each track, which decreases the number of identity switches of the same object and thus further improves the association accuracy. By using the proposed multi-stage association method, we develop a tracker called MSA-MOT for the 3D multi-object tracking task, alleviating the inherent matching problem in one-stage association. Extensive experiments are conducted on the challenging KITTI benchmark, and the results show that our tracker outperforms the previous state-of-the-art methods in terms of both accuracy and speed. Moreover, the ablation and exploration analysis results demonstrate the effectiveness of the proposed multi-stage association method.
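The general idea of reliability-ordered matching with per-track missing-frame limits can be illustrated with a small sketch. This is not the MSA-MOT implementation: the `Detection` and `Track` classes, the greedy center-distance matcher, the two reliability levels, and the thresholds `max_dist` and `max_missed` are all assumptions chosen for illustration, and the creation of new tracks from unmatched detections is left out.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Detection:
    center: tuple   # (x, y, z) object center
    reliable: bool  # assumption: True if confirmed by both modalities


@dataclass
class Track:
    track_id: int
    center: tuple
    reliable: bool = False
    missed: int = 0


def greedy_match(tracks, detections, max_dist):
    """Greedily pair tracks and detections by ascending center distance."""
    pairs = sorted(
        (dist(t.center, d.center), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_t, used_d, matches = set(), set(), []
    for cost, ti, di in pairs:
        if cost > max_dist:
            break  # all remaining pairs are farther apart than the gate
        if ti in used_t or di in used_d:
            continue
        used_t.add(ti)
        used_d.add(di)
        matches.append((tracks[ti], detections[di]))
    return matches


def associate(tracks, detections, max_dist=2.0, max_missed=(2, 5)):
    """Two-stage association, then reliability-dependent track pruning."""
    high = [d for d in detections if d.reliable]
    low = [d for d in detections if not d.reliable]

    # Stage 1: reliable detections get first pick of the tracks.
    matches = greedy_match(tracks, high, max_dist)
    matched_ids = {t.track_id for t, _ in matches}

    # Stage 2: leftover tracks try the less reliable detections.
    rest = [t for t in tracks if t.track_id not in matched_ids]
    matches += greedy_match(rest, low, max_dist)
    matched_ids = {t.track_id for t, _ in matches}

    # Matched tracks take over the detection's position and reliability.
    for t, d in matches:
        t.center, t.reliable, t.missed = d.center, d.reliable, 0

    survivors = []
    for t in tracks:
        if t.track_id not in matched_ids:
            t.missed += 1
        # Reliable tracks tolerate more missed frames than unreliable ones.
        limit = max_missed[1] if t.reliable else max_missed[0]
        if t.missed <= limit:
            survivors.append(t)
    return matches, survivors
```

Calling `associate(tracks, detections)` once per frame returns the matched pairs and the tracks that remain alive. A global assignment solver could replace the greedy matcher, which is used here only to keep the sketch dependency-free.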


Ziming Zhu

Learning task-specific discriminative representations for multiple object tracking

One-Shot Multiple Object Tracking in UAV Videos Using Task-Specific Fine-Grained Features
