Monocular Differentiable Rendering for Self-supervised 3D Object Detection

Beker, Deniz; Kato, Hiroharu; Morariu, Mihai; Ando, Takahiro; Matsuoka, Toru; Kehl, Wadim; Gaidon, Adrien

doi:10.1007/978-3-030-58589-1_31

Cited by 37 publications

(22 citation statements)

References 28 publications


“…The current research methods for 3D target detection can be summarized as LIDAR [3][4] based detection methods, and depth image [5][6] based detection methods. The depth image detection methods are classified into monocular [7][8], binocular [9], and multiocular 3D target detection by designing 2D image feature extractors to capture pixels. In addition, some studies combined multiple heterogeneous feature map with a fusion mode view study approach [10] [11], which proved the superior performance of the fusion module by enhancing point-by-point features with semantic image features [12].…”

Section: Related Work (mentioning)

confidence: 99%

3D Point Cloud Target Detection Method Based on R-PointGNN

Wang et al. 2022

Preprint


Point cloud target detection completes the interaction of 3D features such as vector position and reflection intensity in the coordinate system and 3D visualization with visual enhancement effects, which are widely used in virtual reality, augmented reality and autonomous driving. However, the disorder, sparsity and overlap of LiDAR point clouds increase the difficulty of point cloud recognition. To address this problem, a 3D point cloud target detection model named R-PointGNN (Residual-Graph Neural Network) based on graph neural network is proposed. Deep residual connections are constructed on the graph neural network structure. Firstly, the semantic graphs of the point clouds are constructed by the nearest neighbor algorithm; then, the feature transfer and state update of the graphs are completed by the improved edge convolution; finally, the residual connections are introduced to connect multiple layers of states and extract deep features. Experiments on the KITTI dataset show that the method performs well in the point cloud target detection task with a high degree of discrimination.



“…[22] for a comprehensive overview. This spawned several applications like self-supervision for monocular 3D object detection [5]. In [16], a differentiable renderer is employed - and even learned - to predict geometric correspondence fields to refine pose estimates of 3D objects.…”

Section: Related Work (mentioning)

confidence: 99%

In-Situ Joint Light and Medium Estimation for Underwater Color Restoration

Nakath, She, Song, et al. 2021

2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)


The majority of Earth's surface is situated in the deep sea and thus remains deprived of natural light. Such adverse underwater environments have to be explored with powerful camera-light systems. In order to restore the colors in images taken by such systems, we need to jointly estimate physically-meaningful optical parameters of the light as well as the water column. We thus propose an integrated in-situ estimation approach and a complementary surface texture recovery strategy, which also removes shadows as a by-product. As we operate in a scattering medium under inhomogeneous lighting conditions, the volumetric effects are difficult to capture in closed-form solutions. Hence, we leverage the latest progress in Monte Carlo-based differentiable ray tracing that becomes tractable through recent GPU RTX-hardware acceleration. Evaluations on synthetic data and in a water tank show that we can estimate physically meaningful parameters, which enables color restoration. The approaches could also be employed to other camera-light systems (AUV, robot, car, endoscope) operating either in the dark, in fog - or - underwater.


“…Recently, many works [2,20,24] concentrate on vehicle 3D texture recovery under real environments. Due to the lack of ground truth 3D data of real scene, they mainly pay attention to reconstruct 3D models from 2D data utilizing unsupervised or self-supervised learning.…”

Section: Monocular Vehicle Reconstruction (mentioning)

confidence: 99%

“…Monocular visual scene understanding, which mainly focuses on high-level understanding of single image content, is a fundamental technology for many automatic applications, especially in the field of autonomous driving. Using only a single-view driving image, available vehicle parsing studies have covered popular topics starting from 2D vehicle detection [3,34,32,14,31,46], then 6D vehicle pose recovery [59,55,35,26,52,12,13,4,33], and finally vehicle shape reconstruction [10,28,50,20,2,24,15,29,36,61]. However, much less efforts are devoted to vehicle texture estimation.…”

Section: Introduction (mentioning)

confidence: 99%

Vehicle Reconstruction and Texture Estimation Using Deep Implicit Semantic Template Mapping

Zhao, Zheng, et al. 2020

Preprint


We introduce VERTEX, an effective solution to recover 3D shape and intrinsic texture of vehicles from uncalibrated monocular input in real-world street environments. To fully utilize the template prior of vehicles, we propose a novel geometry and texture joint representation, based on implicit semantic template mapping. Compared to existing representations which infer 3D texture distribution, our method explicitly constrains the texture distribution on the 2D surface of the template as well as avoids limitations of fixed resolution and topology. Moreover, by fusing the global and local features together, our approach is capable to generate consistent and detailed texture in both visible and invisible areas. We also contribute a new synthetic dataset containing 830 elaborate textured car models labeled with sparse key points and rendered using Physically Based Rendering (PBRT) system with measured HDRI skymaps to obtain highly realistic images. Experiments demonstrate the superior performance of our approach on both testing dataset and in-the-wild images. Furthermore, the presented technique enables additional applications such as 3D vehicle texture transfer and material identification.

