We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks that are both deep and high-resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees in which each leaf node stores a pooled feature representation. This allows us to focus memory allocation and computation on the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks, including 3D object classification, orientation estimation and point cloud labeling.

arXiv:1611.05009v4 [cs.CV] 10 Apr 2017

…naïvely. We illustrate this in Fig. 1 for a 3D classification example. Given the 3D meshes of [48], we voxelize the input at a resolution of 64³ and train a simple 3D convolutional network to minimize a classification loss. We depict the maximum of the responses across all feature maps at different layers of the network. It is easy to observe that high activations occur only near the object boundaries.

Motivated by this observation, we propose OctNet, a 3D convolutional network that exploits this sparsity property. Our OctNet hierarchically partitions the 3D space into a set of unbalanced octrees [32]. Each octree splits the 3D space according to the density of the data. More specifically, we recursively split octree nodes that contain data points in their domain, i.e., 3D points or mesh triangles, stopping at the finest resolution of the tree. Leaf nodes therefore vary in size; e.g., an empty leaf node may comprise up to 8³ = 512 voxels for a tree of depth 3. Each leaf node in the octree stores a pooled summary of all feature activations of the voxels it comprises. The convolutional network operations are defined directly on the structure of these trees.
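To make the partitioning scheme concrete, the following is a minimal sketch (not the authors' implementation; node layout and pooling choice are illustrative assumptions) of building an unbalanced octree over a binary occupancy grid: a node is split only while it contains data, and each leaf stores a pooled summary of the voxels it comprises.

```python
import numpy as np

def build_octree(grid, x0, y0, z0, size, max_depth, depth=0):
    """Recursively partition a binary occupancy grid into an unbalanced
    octree: split a node only if its domain contains data and the
    maximum depth has not been reached; otherwise make it a leaf that
    stores a pooled (here: mean) summary of the voxels it comprises."""
    block = grid[x0:x0 + size, y0:y0 + size, z0:z0 + size]
    if depth == max_depth or not block.any():
        return {'leaf': True, 'size': size, 'feature': float(block.mean())}
    half = size // 2
    children = [build_octree(grid, x0 + dx, y0 + dy, z0 + dz,
                             half, max_depth, depth + 1)
                for dx in (0, half) for dy in (0, half) for dz in (0, half)]
    return {'leaf': False, 'size': size, 'children': children}

def count_leaves(node):
    return 1 if node['leaf'] else sum(count_leaves(c) for c in node['children'])

# An 8^3 grid with a single occupied voxel: the tree refines only along
# the path to the data, yielding 22 leaves instead of 512 dense voxels.
grid = np.zeros((8, 8, 8))
grid[0, 0, 0] = 1.0
tree = build_octree(grid, 0, 0, 0, size=8, max_depth=3)
n_leaves = count_leaves(tree)  # 22
```

The example shows where the memory savings come from: empty regions collapse into a few large leaves, while refinement is spent only where the data lives.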
Our network therefore dynamically focuses computational and memory resources depending on the 3D structure of the input. This leads to a significant reduction in computational and memory requirements, which allows for deep learning at high resolutions. Importantly, we also show how essential network operations (convolution, pooling and unpooling) can be implemented efficiently on this new data structure.

We demonstrate the utility of the proposed OctNet on three different problems involving three-dimensional data: 3D classification, 3D orientation estimation of unknown object instances and semantic segmentation of 3D point clouds. In particular, we show that the proposed OctNet enables significantly higher input resolutions than dense inputs due to its lower memory consumption, while achieving performance identical to that of the equivalent dense network at lower resolutions. At the same time, we obtain significant speed-ups at resolutions of 128³ and above. Using our OctNet, we investigate the impact of high-resolution inputs with respect to accuracy on the three tasks and demonstrate that higher resol...
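One way to see why defining convolution on the tree saves computation (a sketch of the underlying property, not the paper's implementation): because a leaf stores a single pooled value, the feature is constant over its domain, so every valid convolution response inside the leaf interior is identical and need only be computed once. The toy check below verifies this with a naive dense convolution.

```python
import numpy as np

def conv3x3x3(vol, kernel):
    """Naive valid 3D convolution, for illustration only."""
    out = np.zeros(tuple(s - 2 for s in vol.shape))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            for z in range(out.shape[2]):
                out[x, y, z] = np.sum(vol[x:x + 3, y:y + 3, z:z + 3] * kernel)
    return out

# A leaf of size 4^3 holds one pooled value, so its feature map is
# constant inside the leaf domain.
leaf = np.full((4, 4, 4), 0.5)
kernel = np.arange(27, dtype=float).reshape(3, 3, 3) / 27.0

out = conv3x3x3(leaf, kernel)
# Convolving a constant region yields value * sum(kernel) everywhere:
# here 0.5 * 13.0 = 6.5 for all 2^3 valid positions, so an octree-aware
# convolution evaluates it once per leaf interior instead of per voxel.
assert np.allclose(out, 0.5 * kernel.sum())
```

Boundary voxels between leaves of different values still require individual evaluation, which is why the savings grow with leaf size, i.e., with the emptiness of the input.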