Deep Cosine Metric Learning for Person Re-identification

Wojke, Nicolai; Bewley, Alex

doi:10.1109/wacv.2018.00087

Cited by 324 publications

(192 citation statements)

References 28 publications

Supporting

Mentioning

190

Contrasting

Unclassified

“…With regard to the baseline [51], on MARS, our PDSR technique yields an overall improvement of +3.2% in rank-1 accuracy and +4% in mAP (Table 3). This improvement is due in part to the weighted fusion strategy WF (+1.6% and +2.9% over the baseline for rank-1 accuracy and mAP, respectively) and in part to the WPR technique (+1.6% and +1.1% for rank-1 accuracy and mAP).…”

Section: Ablation Analysis (mentioning)

confidence: 98%

“…The importance of accounting for the pose/viewpoint invariance problem in person re-id has been amply demonstrated by many works. A popular approach is metric learning [61,58,72,22,51,48], where a similarity metric is learned in the space of the video-level feature vectors expressing different views, aiming to increase the intra-class compactness and the inter-class distance of the identities. A different stream of research, complementary to metric learning, tackles the viewpoint problem by designing or learning more robust feature representations, for example by exploiting the temporal aggregation of multiple frame-level feature maps [30] or by performing spatial fusion/concatenation of global and local features [56,4].…”

Section: Cross-view Invariant Techniques (mentioning)

confidence: 99%
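
The metric-learning goal described in the statement above (compact intra-class clusters, well-separated identities) can be illustrated with cosine similarity, the measure named in the cited paper's title. What follows is only a minimal sketch under stated assumptions: the toy feature vectors, the identity labels, and the helper names are hypothetical and are not taken from any of the cited works.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def mean_pairwise_similarity(features, labels, same_identity):
    """Average cosine similarity over pairs that share (or do not share) a label."""
    sims = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if (labels[i] == labels[j]) == same_identity:
                sims.append(cosine_similarity(features[i], features[j]))
    if not sims:
        raise ValueError("no pairs match the requested condition")
    return sum(sims) / len(sims)


# Hypothetical 2-D embeddings for two identities (illustrative values only).
features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ["id_a", "id_a", "id_b", "id_b"]

intra = mean_pairwise_similarity(features, labels, same_identity=True)
inter = mean_pairwise_similarity(features, labels, same_identity=False)
print(f"mean intra-class similarity: {intra:.3f}")
print(f"mean inter-class similarity: {inter:.3f}")
```

A well-trained embedding should give a higher mean similarity for same-identity pairs than for different-identity pairs, which is the property the toy vectors here are chosen to show.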

“…This is why it lends itself particularly well to be combined with the existing video-based feature extraction techniques for producing even more discriminative embeddings. In this paper, we follow this combined approach for complementing the incomplete pose information of video-sequences and refer to [38] for the GAN architecture and to [51] for the feature extraction CNN.…”

Section: Cross-view Invariant Techniques (mentioning)

confidence: 99%

“…Residual learning-based CNN architecture [51] used for feature extraction. The patch size for the convolutional, pooling and residual layers is 3 × 3. … a training set and a testing set, each one with 702 IDs.…”

mentioning

confidence: 99%

GAN-Based Pose-Aware Regulation for Video-Based Person Re-Identification

Borgia, Yang, Kodirov, et al. 2019

2019 IEEE Winter Conference on Applications of Computer Vision (WACV)

Video-based person re-identification deals with the inherent difficulty of matching unregulated sequences of different lengths and with incomplete target pose/viewpoint structure. Common approaches operate either by reducing the problem to the still-image case, incurring a significant information loss, or by exploiting inter-sequence temporal dependencies, as in Siamese Recurrent Neural Networks or in gait analysis. However, in all cases, the inter-sequence pose/viewpoint misalignment is not considered, and the existing spatial approaches are mostly limited to the still-image context. To this end, we propose a novel approach that exploits the rich video information more effectively by accounting for the role that the changing pose/viewpoint factor plays in the sequence matching process. Specifically, our approach consists of two components. The first, Weighted Fusion (WF), complements the original pose-incomplete information carried by the sequences with synthetic GAN-generated images and fuses their feature vectors into a more discriminative, viewpoint-insensitive embedding. The second, Weighted-Pose Regulation (WPR), performs an explicit pose-based alignment of sequence pairs to promote coherent feature matching. Extensive experiments on two large video-based benchmark datasets show that our approach considerably outperforms existing methods.

“…• For the tracking model, we train the deep association network [46] on the object hypotheses generated from the detection module and feed it to the deep sort algorithm [47] for tracking.…”

Section: Introduction (mentioning)

confidence: 99%

Aerial Multi-Object Tracking by Detection Using Deep Association Networks

Jadhav, Mukherjee, Kaushik, et al. 2020

2020 National Conference on Communications (NCC)

A lot of research is focused on object detection, which has advanced significantly with deep learning techniques in recent years. In spite of the existing research, these algorithms are usually not optimal for sequences or images captured by drone-based platforms, owing to challenges such as viewpoint change, scale variation, density of object distribution, and occlusion. In this paper, we develop a model for detecting objects in drone images using the VisDrone2019 DET dataset. Using the RetinaNet model as our base, we modify the anchor scales to better handle densely distributed and small objects. We explicitly model the channel interdependencies using Squeeze-and-Excitation (SE) blocks that adaptively recalibrate channel-wise feature responses. This brings significant improvements in performance at a slight additional computational cost. Using this architecture for object detection, we build a custom DeepSORT network for object detection on the VisDrone2019 MOT dataset by training a custom Deep Association network for the algorithm.
