Point cloud registration estimates a transformation that aligns point clouds collected from different viewpoints. In learning-based point cloud registration, a robust descriptor is vital for high-accuracy registration. However, most methods are susceptible to noise and generalize poorly to unseen datasets. Motivated by this, we introduce SphereNet to learn a noise-robust descriptor that generalizes to unseen data for point cloud registration. In our method, a spheroid generator first builds a geometric domain based on spherical voxelization to encode initial features. Spherical interpolation is then introduced to achieve robustness against noise. Finally, a new spherical convolutional neural network with spherical integrity padding extracts the descriptors, which reduces the loss of features and fully captures the geometric structure. To evaluate our method, we introduce 3DMatch-noise, a new benchmark with strong noise. Extensive experiments are carried out on both indoor and outdoor datasets. Under high-intensity noise, SphereNet increases feature matching recall by more than 25 percentage points on 3DMatch-noise. In addition, it sets a new state-of-the-art performance on the 3DMatch and 3DLoMatch benchmarks with 93.5% and 75.6% registration recall, respectively, and shows the best generalization ability on unseen datasets.
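The following is a minimal NumPy sketch of the spherical voxelization step described above: a local patch around a keypoint is mapped to spherical coordinates and binned into a radius-elevation-azimuth grid. The bin counts, support radius, and normalization are illustrative assumptions, not SphereNet's exact encoding (which additionally applies spherical interpolation and integrity padding).

# Illustrative spherical voxelization of a local point patch (assumed parameters).
import numpy as np

def spherical_voxelize(points, center, radius=0.3, n_r=8, n_theta=16, n_phi=16):
    """Bin points around `center` into an (n_r, n_theta, n_phi) spherical grid."""
    local = points - center                                   # translate patch to the keypoint
    r = np.linalg.norm(local, axis=1)
    mask = (r > 1e-8) & (r <= radius)                         # keep points inside the support sphere
    local, r = local[mask], r[mask]
    theta = np.arccos(np.clip(local[:, 2] / r, -1.0, 1.0))    # elevation in [0, pi]
    phi = np.arctan2(local[:, 1], local[:, 0]) + np.pi        # azimuth in [0, 2*pi]
    # Quantize each spherical coordinate to its bin index.
    ri = np.minimum((r / radius * n_r).astype(int), n_r - 1)
    ti = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    pi_ = np.minimum((phi / (2 * np.pi) * n_phi).astype(int), n_phi - 1)
    grid = np.zeros((n_r, n_theta, n_phi), dtype=np.float32)
    np.add.at(grid, (ri, ti, pi_), 1.0)                       # point density per spherical voxel
    return grid / max(len(r), 1)                              # normalize by point count

# Example: a random patch around the origin.
pts = np.random.randn(2048, 3) * 0.1
feat = spherical_voxelize(pts, center=np.zeros(3))
print(feat.shape)  # (8, 16, 16)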
Recent advancements in deep learning have opened new opportunities for learning a high-quality 3D model from a single 2D image, given sufficient training on large-scale datasets. However, the significant imbalance between the number of available images and 3D models, and the limited availability of labeled 2D image data (i.e., manually annotated pairs between images and their corresponding 3D models), severely impact the training of most supervised deep learning methods in practice. In this paper, driven by a novel design of adversarial networks, we develop an unsupervised learning paradigm to reconstruct 3D models from a single 2D image, which is free of manually annotated pairs of input images and their associated 3D models. In particular, the paradigm begins by training an adaptation network, an autoencoder with an adversarial loss, which embeds the unpaired 2D synthesized image domain and the real-world image domain into a shared latent vector space. Then, we jointly train a 3D deconvolutional network to transform the latent vector space into the 3D object space together with the embedding process. Our experiments verify our network's robust and superior performance in generating 3D volumetric objects from a single 2D image.

Existing work on 3D object reconstruction from 2D image(s) can be broadly categorized into two groups: traditional methods without learning, and deep learning based methods.

3D reconstruction without learning. The majority of traditional reconstruction methods based on SfM or SLAM [1,2] require a dense set of views, and most of them rely on the hypothesis that features can be matched across views. 2D-to-3D reconstruction approaches such as multi-view stereo [9,10], space carving [11], and multiple-moving-object and large-scale structure from motion [3-5] have all demonstrated good performance on the 2D-to-3D reconstruction problem. However, these methods require carefully calibrated cameras and segmentation of objects from their background, which limits their applicability in practice.

Deep neural networks in 3D visual computing. By generating 3D volumetric data [12], prominent deep learning models such as deep 2D convolutional neural networks can be naturally extended to learn 3D objects. Deep learning models have proven to have strong capabilities in learning latent representative vector spaces of 3D objects [12]. Multi-View CNN, Conv-DAE, VoxNet, GIFT, T-L embedding, 3D-GAN, and others have uncovered great potential for solving retrieval, classification, and 3D reconstruction problems [13-18]. In contrast to the vast amount of research and accomplishments in 3D object classification and retrieval, there is far less work, and far fewer accomplished results, on 3D object reconstruction. Recently, researchers began to utilize 3D deconvolutional neural networks to generate 3D volumetric objects from 2D images; for instance, 3D-GAN [18] and T-L embedding [17] strive to learn a latent vector space representation of 2D images, and then transform it to gene...
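Below is a minimal PyTorch sketch of the two-stage scheme this abstract describes: a shared 2D encoder embeds synthesized and real images into one latent space, a small discriminator adversarially aligns the two latent distributions, and a 3D deconvolutional decoder maps latents to voxel grids. The layer sizes, the 64x64 input, and the $32^3$ output resolution are placeholder assumptions, not the paper's exact architecture.

# Sketch of domain-adapted single-image 3D reconstruction (assumed sizes).
import torch
import torch.nn as nn

class Encoder2D(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(), nn.Linear(128 * 8 * 8, latent_dim))
    def forward(self, x):
        return self.net(x)

class LatentDiscriminator(nn.Module):
    """Predicts whether a latent vector came from the synthetic or real domain."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, z):
        return self.net(z)

class Decoder3D(nn.Module):
    """Maps a latent vector to a 32^3 occupancy volume via 3D deconvolutions."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 4 * 4 * 4)
        self.net = nn.Sequential(
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 4 -> 8
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 8 -> 16
            nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())  # 16 -> 32
    def forward(self, z):
        return self.net(self.fc(z).view(-1, 128, 4, 4, 4))

# One adversarial-alignment step (illustrative only): the discriminator tries
# to separate the domains, while the encoder is trained to fool it.
enc, dec, disc = Encoder2D(), Decoder3D(), LatentDiscriminator()
bce = nn.BCEWithLogitsLoss()
syn_img, real_img = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
z_syn, z_real = enc(syn_img), enc(real_img)
d_loss = bce(disc(z_syn.detach()), torch.ones(4, 1)) + \
         bce(disc(z_real.detach()), torch.zeros(4, 1))
g_loss = bce(disc(z_real), torch.ones(4, 1))           # encoder fools the critic
voxels = dec(z_syn)                                    # (4, 1, 32, 32, 32)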
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work, which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ takes only the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid at a high resolution of $256^3$ by recovering the occluded/missing regions. The key idea is to combine the generative capabilities of a 3D encoder-decoder with the conditional adversarial network framework, to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets and real-world Kinect datasets show that the proposed 3D-RecGAN++ significantly outperforms the state of the art in single-view 3D object reconstruction, and is able to reconstruct unseen types of objects.
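As a concrete illustration of the input representation mentioned above, the sketch below back-projects a single depth view into 3D points with pinhole intrinsics and quantizes them into an occupancy grid. The $64^3$ resolution, the intrinsics, and the synthetic depth map are placeholder assumptions for the demo; 3D-RecGAN++ itself works with grids up to $256^3$ and completes the occluded regions with its network.

# Sketch: voxelize a single depth view into an occupancy grid (assumed setup).
import numpy as np

def depth_to_occupancy(depth, fx, fy, cx, cy, grid=64):
    """depth: (H, W) metric depth map; returns a (grid, grid, grid) 0/1 volume."""
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx                  # pinhole back-projection
    y = (v[valid] - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    # Normalize the visible surface into the unit cube, then quantize.
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    scale = (maxs - mins).max() + 1e-8
    idx = ((pts - mins) / scale * (grid - 1)).astype(int)
    vol = np.zeros((grid, grid, grid), dtype=np.uint8)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1      # occupied surface voxels
    return vol

depth = np.random.uniform(0.5, 1.5, size=(240, 320))   # synthetic depth for the demo
vol = depth_to_occupancy(depth, fx=300.0, fy=300.0, cx=160.0, cy=120.0)
print(vol.shape, vol.sum())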