Tracking Beyond Detection: Learning a Global Response Map for End-to-End Multi-Object Tracking

Wan, Xingyu; Cao, Jiakai; Zhou, Sanping; Wang, Jinjun; Zheng, Nanning

doi:10.1109/tip.2021.3113169

Cited by 15 publications

(5 citation statements)

References 56 publications


“…In recent years, with the advancement of deep learning and object detection, online tracking has attracted more and more attention. On contrary to offline methods, online methods usually adopt the Hungarian algorithm for data association, but focus on the joint learning of object detection and some useful priors, such as object motions [1], [7], [35], appearance features [4], [36], [37], occlusion maps [36], object poses [38] and so on. However, except for the annotation of box and category ID, extra annotations are required for the learning of these priors, e.g., object identity for appearance feature learning.…”

Section: A Multi-object Tracking (mentioning)

confidence: 99%

Online Learned Siamese Network with Auto-Encoding Constraints for Robust Multi-Object Tracking

Liu et al. 2019

Electronics


Multi-object tracking aims to estimate the complete trajectories of objects in a scene. Distinguishing among objects efficiently and correctly in complex environments is a challenging problem. In this paper, a Siamese network with an auto-encoding constraint is proposed to extract discriminative features from detection responses in a tracking-by-detection framework. Different from recent deep learning methods, the simple two layers stacked auto-encoder structure enables the Siamese network to operate efficiently only with small-scale online sample data. The auto-encoding constraint reduces the possibility of overfitting during small-scale sample training. Then, the proposed Siamese network is improved to extract the previous-appearance-next vector from tracklet for better association. The new feature integrates the appearance, previous, and next stage motions of an element in a tracklet. With the new features, an online incremental learned tracking framework is established. It contains reliable tracklet generation, data association to generate complete object trajectories, and tracklet growth to deal with missing detections and to enhance the new feature for tracklet. Benefiting from discriminative features, the final trajectories of objects can be achieved by an efficient iterative greedy algorithm. Feature experiments show that the proposed Siamese network has advantages in terms of both discrimination and correctness. The system experiments show the improved tracking performance of the proposed method.



“…In general, the existing MOT methods either follow the tracking-by-detection [2] or tracking-by-regression [39,40,59], paradigm. The former methods first detect objects in each video frame and then associate detections between adjacent frames to create individual object tracks over time.…”

Section: Introduction (mentioning)

confidence: 99%

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Zheng, Zhou, Wang et al. 2023

Preprint


The main challenge of Multi-Object Tracking (MOT) lies in maintaining a continuous trajectory for each target. Existing methods often learn reliable motion patterns to match the same target between adjacent frames and discriminative appearance features to re-identify the lost targets after a long period. However, the reliability of motion prediction and the discriminability of appearances can be easily hurt by dense crowds and extreme occlusions in the tracking process. In this paper, we propose a simple yet effective multi-object tracker, i.e., MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range. For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target. For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection. Our Interaction Module and Refind Module are embedded in the well-known tracking-by-detection paradigm, which can work in tandem to maintain superior performance. Extensive experimental results on MOT17 and MOT20 datasets demonstrate the superiority of our approach in challenging scenarios, and it achieves state-of-the-art performances at various MOT metrics.


“…To forecast the future location and apply the intersection over union and detection in the computation of association costs, Gao et al [22] dynamically introduce sub-networks for each instance of a person. The detection is used as the positive sample and the areas around it as the negative sample by the authors of [23] when employing the discriminative appearance learning approach for each track. Additionally, they employ spatiotemporal matching based on item size and position, and they multiply these three measurements together along with the pair-wise cost Liu et al [24] to achieve high-performance online tracking, the suggested solution is based on the Gaussian mixture probability hypothesis density (GMPHD) filter, a hierarchical data association (HDA), and a mask-based affinity fusion (MAF) model.…”

Section: Introduction (mentioning)

confidence: 99%

Multiple object tracking using space-time adaptive correlation tracking

Sriram, Purushotham 2023

IJEECS


In application of tracking and detecting the suspicious activities, multiple object tracking (MOT) has been given fine attention due to its application as it provides the parallel task of identification and tracking of human. MOT ensures the identification and trajectory for each object frame as they interact, despite the changes in its appearance, occlusion and various other tasks involved. Recent adoption of deep learning has given a new perspective but still achieving high metrics remains a major issue to overcome such issues, this research work presents the integrated architecture of deep convolutional covariance networks (DCCNs) and space-time adaptive correlation tracking (STACT) algorithm with similarity map function (SMF). Moreover, in proposed work, DCCNs is utilized for feature extractions through each frame capturing the distinctive information, STACT is tracking approaches that utilizes the SMF for locating and tracking objects. SMFs are updated for any changes in human appearances and motion, also it deals with occlusion. Here the proposed model is evaluated on MOT17 and MOT20 dataset. Performance analysis is carried out through comparing the existing model and Integrated-DCCN achieves higher metrics.

