Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475250

Towards a Unified Middle Modality Learning for Visible-Infrared Person Re-Identification

Abstract: Visible-infrared person re-identification (VI-ReID) aims to search identities of pedestrians across different spectra. In this task, one of the major challenges is the modality discrepancy between the visible (VIS) and infrared (IR) images. Some state-of-the-art methods try to design complex networks or generative methods to mitigate the modality discrepancy while ignoring the highly non-linear relationship between the two modalities of VIS and IR. In this paper, we propose a non-linear middle modality generat…

Cited by 97 publications (30 citation statements)
References 83 publications
“…[44] propose a bi-directional top-ranking loss, which samples positive and negative pairs from different modalities and optimizes such cross-modality triplets with a bi-directional interactive iteration manner. More recently, some other works adopt adversarial training strategies to reduce the cross-modality distribution divergence in image-level [29], [30], [32], [36], [46], [49]. For instance, they transfer stylistic properties of visible images to their infrared counterpart, with an identity-preserving constraint [30], [32] or cycle consistency [29], [36].…”
Section: B Visible-infrared Re-id Methods (mentioning)
confidence: 99%
“…Apart from modality-translation-based Re-ID approaches, there are a few attempts [95, 96, 97, 98] that introduce a third modality to reduce the modality discrepancy. The idea of using a third modality was proposed by Li et al. [95], who introduced an “X” modality as a middle modality to eliminate cross-modal discrepancies.…”
Section: Cross-modal Person Re-identification (mentioning)
confidence: 99%
“…Following the same pipeline, in [96, 97], real images from both modalities were combined with ground-truth labels to generate third-modal images, which help to reduce modality-related biases. In [98], a non-linear middle modality generator was proposed that effectively projects images from both modalities onto a unified space to generate an additional modality to reduce the modality discrepancies.…”
Section: Cross-modal Person Re-identification (mentioning)
confidence: 99%
“…The most popular architecture [5, 35, 48] is a double-stream deep network, where shallow layers are independent for learning modal-specific features and deep layers are shared for learning modal-common features. Some researchers improved the double-stream architecture via fine part alignment designs [40, 49], attention mechanisms [35, 36], or new neural structures, such as graph [27] and transformer [32, 50].…”
Section: Related Work (mentioning)
confidence: 99%