PWOC-3D: Deep Occlusion-Aware End-to-End Scene Flow Estimation

Saxena, Rohan; Schuster, René; Wasenmüller, Oliver; Stricker, Didier

doi:10.1109/ivs.2019.8814146

Cited by 39 publications

(48 citation statements)

References 31 publications


“…Specifically, the full search space of such operation is 3D, being it over pixel coordinates plus displacement between correlation curves. We refer the reader to the supplementary material for some examples supporting this rationale, while ablation studies reported among our experiments prove the effectiveness of such a new layer, allowing DWARF to outperform similar architectures (Saxena et al 2019) on the KITTI online benchmark.…”

Section: Cost Volumes and 3D Correlation Layer (mentioning)

confidence: 56%


Learning End-to-End Scene Flow by Distilling Single Tasks Knowledge

Aleotti, Poggi, Tosi, et al. 2020. AAAI


Scene flow is a challenging task aimed at jointly estimating the 3D structure and motion of the sensed environment. Although deep learning solutions achieve outstanding performance in terms of accuracy, these approaches divide the whole problem into standalone tasks (stereo and optical flow) addressing them with independent networks. Such a strategy dramatically increases the complexity of the training procedure and requires power-hungry GPUs to infer scene flow barely at 1 FPS. Conversely, we propose DWARF, a novel and lightweight architecture able to infer full scene flow jointly reasoning about depth and optical flow easily and elegantly trainable end-to-end from scratch. Moreover, since ground truth images for full scene flow are scarce, we propose to leverage on the knowledge learned by networks specialized in stereo or flow, for which much more data are available, to distill proxy annotations. Exhaustive experiments show that i) DWARF runs at about 10 FPS on a single high-end GPU and about 1 FPS on NVIDIA Jetson TX2 embedded at KITTI resolution, with moderate drop in accuracy compared to 10× deeper models, ii) learning from many distilled samples is more effective than from the few, annotated ones available.


“…Recently, Saxena et al (2019) proposed a fast and lightweight model taking into account occlusions. Although similar in design, we will show that DWARF outperforms it by a good margin.…”

Section: Related Work (mentioning)

confidence: 99%


“…In recent years, the CNN-based scene flow methods have shown the good performance on both computational accuracy and efficiency [57]- [59]. However, most of these CNN-based usually approaches require supervised training process and may have difficulty to be directly applied to real world data where ground truth is not easily accessible.…”

Section: Related Work (mentioning)

confidence: 99%

SS-SF: Piecewise 3D Scene Flow Estimation With Semantic Segmentation

Feng, Zhang, et al. 2021. IEEE Access


In order to address the issue of edge-blurring and improve the accuracy and robustness of scene flow estimation under motion occlusions, we in this paper propose a piecewise 3D scene flow estimation approach with semantic segmentation, named SS-SF. First, we utilize the semantic optical flow to initialize the 3D plane and its rigid motion parameters, and then produce the initial mappings of pixel-to-segment and segment-to-plane of the input left and right image sequences. Second, we plan a novel energy function to optimize the initial mappings by using a semantic segmentation constraint term to regularize the classical scene flow model, which the optimized mappings are employed to update the assignment and motion parameters of each pixel. Third, we adopt the semantic label to extract the occlusion pixels and exploit an occlusion handling constraint to enhance the robustness of the scene flow estimation. Finally, we compare the proposed SS-SF model with several state-of-the-art approaches by using the KITTI and MPI-Sintel databases. The experimental results demonstrate that the proposed method has the advanced accuracy and robustness in scene flow estimation, especially owns the capacities of edge-preserving and occlusion handling.


“…A rigid plane model performs poorly when applied to deformable objects, and ego-motion estimation for highly dynamic scenes is hard. (Menze and Geiger, 2015) CSF (Lv et al, 2016) Deep Learning Fast Poor generalization PWOC-3D (Saxena et al, 2019) DRISF (Ma et al, 2019) Sparse-to-Dense Comparatively fast, good generalization Sensitive to distribution of matches SFF Schuster et al (2018c) SFF++ (ours)…”

Section: Related Work (mentioning)

confidence: 99%

“…Methods which are guided by semantic segmentation from deep neural networks will generalize badly to other domains, unless they are fine-tuned for the new task. Same is assumed for upcoming purely learning based approaches (Ma et al, 2019;Saxena et al, 2019) which are potentially even faster than our approach. SFF++ focuses especially on robustness across domains and applications.…”

Section: Related Work (mentioning)

confidence: 99%

SceneFlowFields++: Multi-frame Matching, Visibility Prediction, and Robust Interpolation for Scene Flow Estimation

et al. 2019

Self Cite


State-of-the-art scene flow algorithms pursue the conflicting targets of accuracy, run time, and robustness. With the successful concept of pixel-wise matching and sparse-to-dense interpolation, we shift the operating point in this field of conflicts towards universality and speed. Avoiding strong assumptions on the domain or the problem yields a more robust algorithm. This algorithm is fast because we avoid explicit regularization during matching, which allows an efficient computation. Using image information from multiple time steps and explicit visibility prediction based on previous results, we achieve competitive performances on different data sets. Our contributions and results are evaluated in comparative experiments. Overall, we present an accurate scene flow algorithm that is faster and more generic than any individual benchmark leader.
