RGB-Infrared (IR) person re-identification is very challenging due to the large cross-modality variations between RGB and IR images. The key solution is to learn aligned features that bridge the RGB and IR modalities. Because correspondence labels between every pair of RGB and IR images are lacking, most methods try to alleviate the variations with set-level alignment, reducing the distance between the entire RGB and IR sets. However, this set-level alignment may lead to misalignment of some instances, which limits the performance of RGB-IR Re-ID. Different from existing methods, in this paper we propose to generate cross-modality paired images and perform both global set-level and fine-grained instance-level alignment. Our proposed method has several merits. First, it can perform set-level alignment by disentangling modality-specific and modality-invariant features. Compared with conventional methods, it explicitly removes the modality-specific features, so the modality variation can be better reduced. Second, given cross-modality unpaired images of a person, our method can generate cross-modality paired images from exchanged images. With these pairs, we can directly perform instance-level alignment by minimizing the distance between the images in every pair. Extensive experimental results on two standard benchmarks demonstrate that the proposed model performs favourably against state-of-the-art methods. In particular, on the SYSU-MM01 dataset, our model achieves gains of 9.2% and 7.7% in Rank-1 and mAP, respectively. Code is available at https://github.com/wangguanan/JSIA-ReID.
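The difference between set-level and instance-level alignment described above can be illustrated with a toy sketch. It is not the loss used in the paper: the abstract does not state the actual objectives, so the set-level gap is assumed here to be the squared distance between the mean features of the two sets, and the instance-level gap the mean squared distance over paired features. The function names and the two-dimensional features are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(features: List[Vector]) -> List[float]:
    """Per-dimension mean of a set of feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def set_level_gap(rgb: List[Vector], ir: List[Vector]) -> float:
    """Assumed set-level measure: distance between the two set means."""
    return squared_distance(mean_vector(rgb), mean_vector(ir))


def instance_level_gap(rgb: List[Vector], ir: List[Vector]) -> float:
    """Assumed instance-level measure: mean distance over pairs.

    Requires rgb[i] and ir[i] to show the same content in two modalities,
    which is what generated paired images would provide.
    """
    total = sum(squared_distance(r, i) for r, i in zip(rgb, ir))
    return total / len(rgb)


# Toy features: the two sets have identical means, but every pair is swapped.
rgb_features = [[0.0, 0.0], [2.0, 2.0]]
ir_features = [[2.0, 2.0], [0.0, 0.0]]

print("set-level gap:", set_level_gap(rgb_features, ir_features))
print("instance-level gap:", instance_level_gap(rgb_features, ir_features))
```

In this toy case the set-level gap is zero even though no individual pair is aligned, which is the kind of instance misalignment that a set-level objective alone cannot detect.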
As an indispensable part of Intelligent Traffic Systems (ITS), traffic forecasting is inherently subject to the following three challenges. First, traffic data are physically associated with road networks, and thus should be formatted as traffic graphs rather than regular grid-like tensors. Second, traffic data exhibit strong spatial dependence, which implies that the nodes in the traffic graphs usually have complex and dynamic relationships with one another. Third, traffic data demonstrate strong temporal dependence, which is crucial for traffic time series modeling. To address these issues, we propose a novel framework named Structure Learning Convolution (SLC) that extends the traditional convolutional neural network (CNN) to graph domains and learns the graph structure for traffic forecasting. Technically, SLC explicitly models the structure information in the convolutional operation. Under this framework, various non-Euclidean CNN methods can be considered particular instances of our formulation, yielding a flexible mechanism for learning on graphs. Along this technical line, two SLC modules are proposed to capture the global and local structures respectively, and they are integrated to construct an end-to-end network for traffic forecasting. Additionally, Pseudo three Dimensional convolution (P3D) networks are combined with SLC to capture the temporal dependencies in traffic data. Extensive comparative experiments on six real-world datasets demonstrate that our proposed approach significantly outperforms state-of-the-art methods.
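The idea of putting structure information inside the convolution can be sketched as follows. This is only an illustration under stated assumptions, not the SLC formulation itself, which the abstract does not give: the structure is assumed to be a matrix of free scores (which would be trained in a real model), normalized per node with a softmax, and the convolution is assumed to be a weighted aggregation over nodes followed by a scalar filter weight. All names and values are hypothetical.

```python
import math
from typing import List


def row_softmax(scores: List[List[float]]) -> List[List[float]]:
    """Turn raw structure scores into per-node weights that sum to 1."""
    normalized = []
    for row in scores:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        normalized.append([e / total for e in exps])
    return normalized


def structure_conv(structure: List[List[float]],
                   signal: List[float],
                   weight: float) -> List[float]:
    """Compute y_i = weight * sum_j structure[i][j] * signal[j]."""
    return [
        weight * sum(s * x for s, x in zip(row, signal))
        for row in structure
    ]


# Three sensors on a road graph. The scores stand in for parameters that
# would be learned; all-zero scores give every node equal influence.
structure_scores = [[0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]]
speeds = [30.0, 60.0, 90.0]  # one traffic reading per sensor

structure = row_softmax(structure_scores)
smoothed = structure_conv(structure, speeds, weight=1.0)
print(smoothed)
```

Replacing the learned matrix with a fixed, normalized adjacency matrix recovers an ordinary neighborhood aggregation, which is one way to read the claim that other non-Euclidean convolutions are particular instances of a structure-aware formulation.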