2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021
DOI: 10.1109/wacv48630.2021.00295

CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences

Abstract: While ground truth depth data remains hard to obtain, self-supervised monocular depth estimation methods enjoy growing attention. Much research in this area aims at improving loss functions or network architectures. Most works, however, do not leverage self-supervision to its full potential. They stick to the standard closed world train-test pipeline, assuming the network parameters to be fixed after the training is finished. Such an assumption does not allow to adapt to new scenes, whereas with self-supervisi…

Citations: cited by 40 publications (33 citation statements)
References: 46 publications
“…Baseline [50,13,15] ---PFT [49,5,24,27,34,20] --Ambrus et al [1] -D→E -ManyDepth [44] E→D -Nabavi et al [28] -D→E DRO [14] -D E Ours D→E Table 1: Summarizing the degree of coupling between depth and egomotion that are retained at inference time. Ours is the only method that uses all three forms of coupling.…”
Section: Methods Inference-time Coupling | Citation type: mentioning
confidence: 99%
“…Building on this taxonomy, we present a novel network structure that ensures the depth and egomotion network predictions are tightly coupled at both training and inference time by incorporating all three coupling strategies. Our approach leverages two specific methods: test-time optimization [49,5,44,24,27,34,20] for parameter fine-tuning (PFT), and iterative view synthesis [28] to recursively update the egomotion network input with the most recent synthesized view. Through extensive experiments, we demonstrate that our approach promotes consistency between the depth and egomotion predictions at test time, improves generalization on new data, and leads to state-of-the-art accuracy on indoor and outdoor depth and egomotion evaluation benchmarks.…”
Section: Methods Inference-time Coupling | Citation type: mentioning
confidence: 99%
“…Most CL approaches employ one of three strategies [18]: 1) experience replay including rehearsal and generative replay, 2) regularization, and 3) architectural approaches. Rehearsal refers to saving raw data samples of previous tasks, thereby handling the memory aspect, and re-using them during adaptation to new tasks, e.g., the replay buffer in CoMoDA [19]. To minimize the required size of the memory, the most representative samples should be carefully chosen or replaced by more abstract knowledge representations, e.g., flashcards [20].…”
Section: Related Work | Citation type: mentioning
confidence: 99%
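
The rehearsal idea in the statement above (store raw past samples and re-use them while adapting to new data) can be illustrated with a minimal sketch. This is not CoMoDA's implementation: the class `ReplayBuffer`, the helper `make_batch`, the first-in-first-out eviction, and the uniform sampling are all assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past samples (hypothetical rehearsal memory)."""

    def __init__(self, capacity, seed=0):
        # Assumption: when the buffer is full, the oldest sample is evicted first.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, sample):
        self._items.append(sample)

    def __len__(self):
        return len(self._items)

    def sample(self, k):
        # Assumption: uniform sampling without replacement over stored samples.
        k = min(k, len(self._items))
        return self._rng.sample(list(self._items), k)


def make_batch(new_samples, buffer, n_replay):
    """Mix current samples with replayed past ones, then remember the new ones."""
    batch = list(new_samples) + buffer.sample(n_replay)
    for s in new_samples:
        buffer.add(s)
    return batch


buffer = ReplayBuffer(capacity=3)
first = make_batch(["frame_a", "frame_b"], buffer, n_replay=2)   # buffer still empty
second = make_batch(["frame_c", "frame_d"], buffer, n_replay=2)  # replays a and b
```

Here the second batch contains the two new frames plus both previously stored frames, so an adaptation step on it sees old and new data together. Which samples to keep is the design question the statement raises; a more selective policy than first-in-first-out would replace the `deque`.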
“…Test-time refinement approaches adapt monocular methods to use sequence information at test time, e.g., [5,9,59,62,72,51]. As self-supervised training does not require any ground truth depth supervision, the same losses used during training can be applied to the test frames to update the network's parameters.…”
Section: Multi-frame Monocular Depth Estimation | Citation type: mentioning
confidence: 99%
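
The statement above says that a self-supervised loss needs no ground truth, so it can also be minimized on test frames. A toy sketch of that idea follows, under strong simplifying assumptions: the "network" is a single parameter (a 1-D shift between two signals), the loss is a mean squared reconstruction error, and the gradient is taken by finite differences rather than backpropagation. All names (`sample`, `photometric_loss`, `refine`) are hypothetical and do not come from any of the cited methods.

```python
import math


def sample(signal, x):
    """Linearly interpolate `signal` at fractional index x (clamped at the borders)."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    t = x - i
    return (1 - t) * signal[i] + t * signal[i + 1]


def photometric_loss(shift, source, target):
    """Mean squared error between the target and the source warped by `shift`."""
    n = len(target)
    return sum((sample(source, i + shift) - target[i]) ** 2 for i in range(n)) / n


def refine(shift, source, target, steps=50, lr=2.0, eps=1e-3):
    """Test-time refinement: gradient steps on the self-supervised loss only."""
    for _ in range(steps):
        # Central finite difference stands in for backpropagation in this toy.
        grad = (photometric_loss(shift + eps, source, target)
                - photometric_loss(shift - eps, source, target)) / (2 * eps)
        shift -= lr * grad
    return shift


# Two "frames": the target is the source seen under an unknown shift.
source = [math.sin(0.2 * i) for i in range(64)]
true_shift = 3.0
target = [sample(source, i + true_shift) for i in range(64)]

initial = 1.0  # stands in for the parameter value left by offline training
refined = refine(initial, source, target)
```

No ground-truth shift is used inside `refine`; only the two frames are. In a real depth network the scalar `shift` would be replaced by the network weights and the warp by view synthesis from predicted depth and egomotion.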