Flow guided mutual attention for person re-identification

Kiran, Madhu; Bhuiyan, Amran; Nguyen-Meidine, Le Thanh; Blais-Morin, Louis-Antoine; Ayed, Ismail Ben; Granger, Éric

doi:10.1016/j.imavis.2021.104246

Cited by 8 publications

(11 citation statements)

References 8 publications

“…A gated attention network typically receives a gating signal from another module that provides contextual information (Kiran et al, 2021;Bhuiyan et al, 2020;Subramaniam et al, 2019). We here propose to use a gating signal from one modality to act on the backbone of the other modality during training and thereby especially reinforce the features needed for cross-modal matching.…”

Section: Gated Attention Network (mentioning)

confidence: 99%

“…To increase the feature granularity, there is a common trend to use the attention mechanism to address the issue of misalignment in reidentifications. Inspired by the recent success of the gated attention mechanism (Kiran et al, 2021;Bhuiyan et al, 2020;Subramaniam et al, 2019), we propose to additionally integrate a cross-modal gated attention mechanism to mitigate the misalignment issue by dynamically selecting the CNN filters. Most of these state-of-the-art approaches use different contextual information to gate the backbone architecture.…”

Section: Introduction (mentioning)

confidence: 99%

“…Most of these state-of-the-art approaches use different contextual information to gate the backbone architecture. For instance, Kiran et al (2021) uses optical flow, Subramaniam et al (2019) uses co-segmentation and Bhuiyan et al (2020) uses pose guided contextual information. Unlike (Kiran et al, 2021;Bhuiyan et al, 2020) and Subramaniam et al (2019), we introduce the use of cross-modal contextual information, i.e the contextual information from one modality is processed to gate the backbone architecture of another modality.…”

Section: Introduction (mentioning)

confidence: 99%

“…For instance, Kiran et al (2021) uses optical flow, Subramaniam et al (2019) uses co-segmentation and Bhuiyan et al (2020) uses pose guided contextual information. Unlike (Kiran et al, 2021;Bhuiyan et al, 2020) and Subramaniam et al (2019), we introduce the use of cross-modal contextual information, i.e the contextual information from one modality is processed to gate the backbone architecture of another modality. Following the common trend in Kiran et al (2021), Bhuiyan et al (2020) and Subramaniam et al (2019), we rely on a simple gated attention mechanism which allows for multiplicative interaction between the input features from one modality and the attention map from another modality.…”

Section: Introduction (mentioning)

confidence: 99%

“…Unlike (Kiran et al, 2021;Bhuiyan et al, 2020) and Subramaniam et al (2019), we introduce the use of cross-modal contextual information, i.e the contextual information from one modality is processed to gate the backbone architecture of another modality. Following the common trend in Kiran et al (2021), Bhuiyan et al (2020) and Subramaniam et al (2019), we rely on a simple gated attention mechanism which allows for multiplicative interaction between the input features from one modality and the attention map from another modality. This attention is applied into midlevel layer of the respective CNN stream that provide back-propagated gradients corresponding to the amplified local similarities.…”

Section: Introduction (mentioning)

confidence: 99%
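
The citation statement above describes a gated attention mechanism as a multiplicative interaction between the input features of one stream and an attention map derived from another. A minimal NumPy sketch of that idea follows. It is an illustration only, not the method of Kiran et al (2021) or of the citing paper: the function name `gated_attention`, the channel-mean reduction, and the sigmoid squashing are all assumptions chosen for brevity.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_attention(features, context):
    """Gate one stream's feature map with an attention map from another stream.

    features: array of shape (C, H, W), the stream being gated.
    context:  array of shape (C2, H, W), the stream providing the gating signal.

    Hypothetical design: the context channels are averaged into one spatial
    map and squashed with a sigmoid. Real systems typically learn this mapping.
    """
    # Collapse the context channels into a single (1, H, W) spatial map.
    attention = sigmoid(context.mean(axis=0, keepdims=True))
    # Multiplicative interaction: the map is broadcast across all C channels.
    gated = features * attention
    return gated, attention


# Example with random stand-in feature maps (no real data involved).
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
context = rng.standard_normal((2, 8, 8))
gated, attention = gated_attention(features, context)
```

Because the attention values lie strictly between 0 and 1, this form of gating can only attenuate activations, so spatial locations that the context stream scores low are suppressed in the gated stream.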
