Unsupervised Cross-Spectral Stereo Matching by Learning to Synthesize

Liang, Mingyang; Guo, X.; Li, Hongsheng; Wang, Xiaogang; Song, You

doi:10.1609/aaai.v33i01.33018706

Cited by 29 publications

(10 citation statements)

References 21 publications


“…To estimate correspondence from stereo images with radiometric variation, different robust matching costs are proposed, such as mutual information measure [11] and adaptive normalized cross-correlation [12]. For cross-modal stereo [40,45,49], images from two different modalities are normalized to a single one to make up the photometric inconsistency, e.g., through deep transformation networks [21,49]. Recently, stereo matching with visual imbalance (monocular blur and noise) is addressed by integrating a view synthesis network and a stereo reconstruction network, which requires the ground-truth disparity, the high-quality version of the degraded view, and the explicit degradation form for supervision [25].…”

Section: Related Work (mentioning)

confidence: 99%

Degradation-agnostic Correspondence from Resolution-asymmetric Stereo

Chen, Zhang, Cheng et al. 2022

Preprint


In this paper, we study the problem of stereo matching from a pair of images with different resolutions, e.g., those acquired with a tele-wide camera system. Due to the difficulty of obtaining ground-truth disparity labels in diverse real-world systems, we start from an unsupervised learning perspective. However, resolution asymmetry caused by unknown degradations between two views hinders the effectiveness of the generally assumed photometric consistency. To overcome this challenge, we propose to impose the consistency between two views in a feature space instead of the image space, named feature-metric consistency. Interestingly, we find that, although a stereo matching network trained with the photometric loss is not optimal, its feature extractor can produce degradation-agnostic and matching-specific features. These features can then be utilized to formulate a feature-metric loss to avoid the photometric inconsistency. Moreover, we introduce a self-boosting strategy to optimize the feature extractor progressively, which further strengthens the feature-metric consistency. Experiments on both simulated datasets with various degradations and a self-collected real-world dataset validate the superior performance of the proposed method over existing solutions.
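The difference between photometric and feature-metric consistency described in this abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `toy_features` (image gradients) stands in for the learned feature extractor the abstract refers to, and the disparity convention (left pixel x matches right pixel x - d) is assumed.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Resample the right view at x - d(x, y), linearly interpolating along rows."""
    h, w = right.shape
    cols = np.arange(w, dtype=np.float64)[None, :] - disparity
    cols = np.clip(cols, 0.0, w - 1.0)  # clamp samples that leave the image
    x0 = np.floor(cols).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = cols - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def toy_features(image):
    """Hypothetical stand-in for a learned extractor: horizontal and vertical gradients."""
    grad_y, grad_x = np.gradient(image)
    return np.stack([grad_x, grad_y])


def photometric_loss(left, right, disparity):
    """Mean absolute intensity difference between the left view and the warped right view."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))


def feature_metric_loss(left, right, disparity, extract=toy_features):
    """Same comparison, but between feature maps instead of raw intensities."""
    feat_left = extract(left)
    feat_right = extract(right)
    warped = np.stack([warp_right_to_left(ch, disparity) for ch in feat_right])
    return float(np.mean(np.abs(feat_left - warped)))
```

With this toy extractor, adding a constant brightness offset to one view changes the photometric loss but leaves the feature-metric loss essentially unchanged. The offset is used here only because it is easy to verify; the paper itself targets resolution asymmetry and relies on learned features rather than gradients.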



“…Generative approaches for RGB and NIR use a cycleGAN to match across generated stereo pairs for both spectra [42], [43]. Performance of these image-to-image translation approaches are subject to spectral similarity.…”

Section: B Machine Learning and Self-supervised Training (mentioning)

confidence: 99%

“…Both neural network approaches, from Zhi et al [31] and Liang et al [42] leverage image-to-image translation. This is possible because the majority of materials in the NIR spectrum closely resemble their grayscale counterparts in the visible image.…”

Section: B RGB-NIR Evaluation (mentioning)

confidence: 99%

There and Back Again: Self-supervised Multispectral Correspondence Estimation

Walters, Méndez, Johnson et al. 2021

Preprint


Across a wide range of applications, from autonomous vehicles to medical imaging, multi-spectral images provide an opportunity to extract additional information not present in color images. One of the most important steps in making this information readily available is the accurate estimation of dense correspondences between different spectra. Due to the nature of cross-spectral images, most correspondence solving techniques for the visual domain are simply not applicable. Furthermore, most cross-spectral techniques utilize spectra-specific characteristics to perform the alignment. In this work, we aim to address the dense correspondence estimation problem in a way that generalizes to more than one spectrum. We do this by introducing a novel cycle-consistency metric that allows us to self-supervise. This, combined with our spectra-agnostic loss functions, allows us to train the same network across multiple spectra. We demonstrate our approach on the challenging task of dense RGB-FIR correspondence estimation. We also show the performance of our unmodified network on the cases of RGB-NIR and RGB-RGB, where we achieve higher accuracy than similar self-supervised approaches. Our work shows that cross-spectral correspondence estimation can be solved in a common framework that learns to generalize alignment across spectra.
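The abstract does not spell out the cycle-consistency metric, so the following is only a generic forward-backward check for rectified stereo, not the paper's formulation. It assumes a hypothetical convention in which left pixel x matches right pixel x - disp_left, and right pixel x matches left pixel x + disp_right. A pixel is sent to the other view and back, and the drift from its starting column is the error.

```python
import numpy as np


def cycle_error(disp_left, disp_right):
    """Mean horizontal drift after mapping left -> right -> left.

    Assumed convention: left pixel x matches right pixel x - disp_left, and
    right pixel x matches left pixel x + disp_right.
    """
    h, w = disp_left.shape
    x_start = np.broadcast_to(np.arange(w, dtype=np.float64), (h, w))
    x_in_right = x_start - disp_left
    # Look up the right-view disparity at the nearest valid pixel.
    nearest = np.clip(np.rint(x_in_right).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None]
    x_back = x_in_right + disp_right[rows, nearest]
    return float(np.mean(np.abs(x_back - x_start)))
```

Because the error depends only on the predicted correspondences and not on pixel intensities, a check of this kind can be applied without assuming that the two spectra look alike.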


“…Most deep stereo models are particularly data dependent and their performance drops considerably when dealing with unseen domains different from those observed during training [37,38]. To tackle the domain shift problem, two main strategies are involved: image synthesis [6,14,18], and un-/self-supervised adaptation [3,10,25,37,38,39,40,48,49]. In contrast, our method aims at being transferred without adaptation to different domains, being this possibility more appealing for practical applications.…”

Section: Related Work (mentioning)

confidence: 99%

Matching-space Stereo Networks for Cross-domain Generalization

Cai, Poggi, Mattoccia et al. 2020

Preprint


End-to-end deep networks represent the state of the art for stereo matching. While excelling on images framing environments similar to the training set, major drops in accuracy occur in unseen domains (e.g., when moving from synthetic to real scenes). In this paper we introduce a novel family of architectures, namely Matching-Space Networks (MS-Nets), with improved generalization properties. By replacing learning-based feature extraction from image RGB values with matching functions and confidence measures from conventional wisdom, we move the learning process from the color space to the Matching Space, avoiding over-specialization to domain specific features. Extensive experimental results on four real datasets highlight that our proposal leads to superior generalization to unseen environments over conventional deep architectures, keeping accuracy on the source domain almost unaltered. Our code is available at https://github.com/ccj5351/MS-Nets.
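To make the idea of a "matching space" concrete, the sketch below builds a cost volume from a conventional matching function and derives a simple confidence value from it. The abstract does not say which matching functions or confidence measures MS-Nets use, so the absolute-difference cost, the best-versus-second-best margin, and both function names are assumptions chosen for brevity. The code also assumes grayscale images with `1 <= max_disp < width`.

```python
import numpy as np


def sad_cost_volume(left, right, max_disp):
    """Absolute-difference cost for each candidate disparity 0..max_disp."""
    h, w = left.shape
    volume = np.full((max_disp + 1, h, w), np.inf)  # inf marks out-of-image candidates
    for d in range(max_disp + 1):
        # Left pixel x is compared with right pixel x - d.
        volume[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return volume


def winner_take_all(volume):
    """Pick the cheapest disparity and a simple confidence (second-best minus best cost)."""
    ordered = np.sort(volume, axis=0)
    disparity = np.argmin(volume, axis=0)
    margin = ordered[1] - ordered[0]  # inf where only one candidate is valid
    return disparity, margin
```

A network fed with quantities like `volume` and `margin` sees matching costs rather than raw colors, which is the kind of input the abstract argues is less tied to a particular image domain.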
