2017
DOI: 10.1109/tip.2017.2700762

End-to-End Comparative Attention Networks for Person Re-Identification

Abstract: Person re-identification across disjoint camera views has been widely applied in video surveillance, yet it is still a challenging problem. One of the major challenges lies in the lack of spatial and temporal cues, which makes it difficult to deal with large variations of lighting conditions, viewing angles, body poses and occlusions. Recently, several deep learning based person re-identification approaches have been proposed and achieved remarkable performance. However, most of those approaches extrac…

Cited by 562 publications (323 citation statements)
References 70 publications (105 reference statements)

“…Only K-LFDA when trained with mom LE [24] feature attains comparable performance than DMN. However, motivated to resolve the challenges for reidentification in real world (i.e., multimodal image space, and diverse impostors) IRM3 + CVI ( = 15) has much better results than MCP-CNN [39], E2E-CAN [31], Quadruplet-Net [33], and JLML [34], while our IRM3 + CVI ( = 15) has 1.49% higher rank@1 than DLPA [32]. DLPA extracts deep features by semantically aligning body parts, as well as rectifying pose variations.…”
Section: Results on CUHK01 (mentioning)
confidence: 99%
“…Deep learning is very extensive. Some works [15] [16] [17] also propose person re-identification methods based on deep learning, which design the structure of deep learning network to improve the performance of their methods. Multi-task learning is raised because we focus on a single task, and ignore other information that might help optimize metrics.…”
Section: Related Work (mentioning)
confidence: 99%
“…These models assign weights to different parts of each frame, making some of them more important than others. In particular, [12] proposes integrating a spatial attention based model in a siamese network to adaptively focus on the important local parts of an input image pair.…”
Section: Related Work (mentioning)
confidence: 99%
“…Nevertheless there are relatively few attempts to use Attention in the field of Automatic Re-Identification. [12] proposes integrating a soft attention based model in a Siamese network to focus adaptively on the important local regions of an input image pair. [13] uses a spatial pyramid layer as the component attentive spatial pooling to select important regions in spatial dimension.…”
Section: Introduction (mentioning)
confidence: 99%
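
The citation statements above describe soft spatial attention in a Siamese network as assigning weights to local regions of each image in a pair. As a rough illustration only, the sketch below shows one generic way such weighting can work: region features are scored against a query vector, the scores are normalised with a softmax, and the weighted sum is compared across the two images. This is not the architecture of the cited paper; the function names, the dot-product scoring, and the fixed `query` vector are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_pool(region_features, query):
    """Hypothetical soft attention: score each region by its dot product
    with `query`, normalise the scores, and return the attention weights
    together with the weighted sum of the region features."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, pooled


def attended_distance(regions_a, regions_b, query):
    """Siamese-style comparison (assumed form): the same attention function
    is applied to both images, then the pooled features are compared with
    Euclidean distance."""
    _, pooled_a = soft_attention_pool(regions_a, query)
    _, pooled_b = soft_attention_pool(regions_b, query)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pooled_a, pooled_b)))


# Toy example: three local regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
weights, pooled = soft_attention_pool(regions, query)
```

In this toy example the first region has the largest dot product with `query`, so it receives the largest weight. In a trained model the scoring function would be learned rather than fixed.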