2019
DOI: 10.1609/aaai.v33i01.33018706

Unsupervised Cross-Spectral Stereo Matching by Learning to Synthesize

Abstract: Unsupervised cross-spectral stereo matching aims at recovering disparity given cross-spectral image pairs without any depth or disparity supervision. The estimated depth provides additional information complementary to original images, which can be helpful for other vision tasks such as tracking, recognition and detection. However, there are large appearance variations between images from different spectral bands, which is a challenge for cross-spectral stereo matching. Existing deep unsupervised stereo matchi…

Cited by 29 publications (10 citation statements)
References 21 publications
“…To estimate correspondence from stereo images with radiometric variation, different robust matching costs are proposed, such as mutual information measure [11] and adaptive normalized cross-correlation [12]. For cross-modal stereo [40,45,49], images from two different modalities are normalized to a single one to make up the photometric inconsistency, e.g., through deep transformation networks [21,49]. Recently, stereo matching with visual imbalance (monocular blur and noise) is addressed by integrating a view synthesis network and a stereo reconstruction network, which requires the ground-truth disparity, the high-quality version of the degraded view, and the explicit degradation form for supervision [25].…”
Section: Related Work (mentioning)
confidence: 99%
“…Generative approaches for RGB and NIR use a cycleGAN to match across generated stereo pairs for both spectra [42], [43]. Performance of these image-to-image translation approaches are subject to spectral similarity.…”
Section: B Machine Learning and Self-supervised Training (mentioning)
confidence: 99%
“…Both neural network approaches, from Zhi et al. [31] and Liang et al. [42], leverage image-to-image translation. This is possible because the majority of materials in the NIR spectrum closely resemble their grayscale counterparts in the visible image.…”
Section: B RGB-NIR Evaluation (mentioning)
confidence: 99%
“…Most deep stereo models are particularly data dependent and their performance drops considerably when dealing with unseen domains different from those observed during training [37,38]. To tackle the domain shift problem, two main strategies are involved: image synthesis [6,14,18], and un-/self-supervised adaptation [3,10,25,37,38,39,40,48,49]. In contrast, our method aims at being transferred without adaptation to different domains, being this possibility more appealing for practical applications.…”
Section: Related Work (mentioning)
confidence: 99%