2022
DOI: 10.3390/app122110741
|View full text |Cite
|
Sign up to set email alerts
|

A Review of Deep Learning-Based Visual Multi-Object Tracking Algorithms for Autonomous Driving

Abstract: Multi-target tracking, a high-level vision job in computer vision, is crucial to understanding autonomous driving surroundings. Numerous top-notch multi-object tracking algorithms have evolved in recent years as a result of deep learning’s outstanding performance in the field of visual object tracking. There have been a number of evaluations on individual sub-problems, but none that cover the challenges, datasets, and algorithms associated with visual multi-object tracking in autonomous driving scenarios. In t… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3

Citation Types

0
21
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
4
4

Relationship

0
8

Authors

Journals

citations
Cited by 34 publications
(21 citation statements)
references
References 69 publications
0
21
0
Order By: Relevance
“…Multiple object tracking (MOT) is one of the most important problem in computer vision and has applications in areas of autonomous robotics [20,50], autonomous driving [12,24,43,52] and smart cities [8,44,52,72]. The problem consists of determining the position and identity of each object of interest (e.g.…”
Section: Introductionmentioning
confidence: 99%
“…Multiple object tracking (MOT) is one of the most important problem in computer vision and has applications in areas of autonomous robotics [20,50], autonomous driving [12,24,43,52] and smart cities [8,44,52,72]. The problem consists of determining the position and identity of each object of interest (e.g.…”
Section: Introductionmentioning
confidence: 99%
“…Currently, target detection based on deep learning can be categorized into the following two types: two-stage detection algorithms and single-stage detection algorithms. Most of the two-stage detection algorithms rely on the region candidate network to generate candidate frames, and feature extraction of the candidate frame target is carried out by a convolutional neural network [ 8 ]. For example, for a traditional deep learning network in which the feature information transfer process lacks interdependence, Ke et al [ 9 ] proposed a dense attention network structure through the introduction of dense connections as well as an attention module that enhances the detection ability of the model.…”
Section: Introductionmentioning
confidence: 99%
“…Wang et al [ 6 ] utilized radar detections to guide object search in images, while Yu et al [ 7 ] presented a high-level fusion of camera and radar sensors, leveraging deep learning for object detection and Kalman filtering for tracking. Although recent research tends to favor the elimination of radar sensors in favor of relying solely on camera sensors for advanced and autonomous driving functions, as mentioned by the authors in their review [ 8 ], it is important to note that camera sensors still have significant limitations, particularly in the distance and velocity estimation. To address these limitations and ensure better precision and completeness of detections, the integration of range sensors remains necessary.…”
Section: Introductionmentioning
confidence: 99%