2019 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2019.00278

Pix2Vox: Context-Aware 3D Reconstruction From Single and Multi-View Images

Abstract: Recovering the 3D representation of an object from single-view or multi-view RGB images by deep neural networks has attracted increasing attention in the past few years. Several mainstream works (e.g., 3D-R2N2) use recurrent neural networks (RNNs) to fuse multiple feature maps extracted from input images sequentially. However, when given the same set of input images with different orders, RNN-based approaches are unable to produce consistent reconstruction results. Moreover, due to long-term memory loss, RNNs …


Cited by 296 publications (178 citation statements); references 27 publications.

“…To verify the effectiveness and advantages of the proposed 3DMGNet, we compare 3DMGNet with state-of-the-art methods, i.e., 3D-R2N2 [23], OGN [36], DRC [37], Pix2Vox-F [38], and Pix2Vox++/F [39]. To unify the comparison conditions, we follow the same experimental settings as in Pix2Vox [38] and compare the IoU results at a threshold of 0.3. The comparison results are shown in Table 4.…”
Section: Results (citation type: mentioning, confidence: 99%)
“…Compared with 3D-R2N2 [23], OGN [36], and DRC [37], 3DMGNet achieves the best generation results, obtaining the highest IoU value in every category except Watercraft. 3DMGNet also achieves better generation accuracy than Pix2Vox-F [38] and Pix2Vox++/F [39] in most categories, i.e., Airplane, Bench, Chair, Display, Lamp, Rifle, Sofa, and Telephone. Pix2Vox-F and Pix2Vox++/F solve the single-view 3D model generation problem by spatial mapping, while 3DMGNet approaches it from the perspective of multi-modal feature fusion.…”
Section: Results (citation type: mentioning, confidence: 99%)
“…This can be done in a straightforward manner if the camera parameters are known. Otherwise, the fusion can be done using point cloud registration techniques [80], [81] or fusion networks [82]. Also, point-set representations require fixing the number of points N in advance, whereas in methods that use grid representations the number of points can vary with the nature of the object, though it is always bounded by the grid resolution.…”
Section: Representations (citation type: mentioning, confidence: 99%)
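The contrast drawn above between a fixed point budget and a resolution-bounded voxel grid can be illustrated with a toy NumPy sketch (the sizes N and res are hypothetical, not taken from any cited method):

```python
import numpy as np

# Point-set output: the number of points N is fixed ahead of time,
# so every prediction has the same (N, 3) shape regardless of the object.
N = 1024                       # hypothetical fixed point budget
points = np.zeros((N, 3))      # always N x 3

# Voxel-grid output: the number of occupied cells varies with the
# object, but can never exceed the total cell count res**3.
res = 32
rng = np.random.default_rng(1)
grid = rng.random((res, res, res)) > 0.7   # toy occupancy grid
occupied = np.argwhere(grid)               # object-dependent count
assert points.shape == (N, 3)
assert occupied.shape[0] <= res ** 3
```

The bound is what the quoted survey means by "bounded by the grid resolution": a finer grid raises the ceiling but never removes it.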
“…Also, when given the same set of images in different orders, RNNs are unable to estimate the 3D shape of an object consistently due to permutation variance. To overcome these limitations, Xie et al. [82] introduced Pix2Vox, which is composed of multiple encoder-decoder blocks running in parallel, each of which predicts a coarse volumetric grid from its input frame. This eliminates the effect of the order of input images and accelerates the computation.…”
Section: Exploiting Spatio-temporal Correlations (citation type: mentioning, confidence: 99%)
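The order-invariance property described above can be sketched in NumPy. This is not the actual Pix2Vox network (which uses learned CNN branches and a context-aware scoring fusion); `coarse_volume` is a stand-in for one encoder-decoder branch, and element-wise max is used as a simple symmetric fusion to show why parallel per-view prediction removes the ordering problem that RNNs have:

```python
import numpy as np

rng = np.random.default_rng(0)

def coarse_volume(view, res=32):
    # Stand-in for one encoder-decoder branch: in Pix2Vox this would be
    # a learned CNN mapping an RGB view to a coarse occupancy grid.
    return rng.random((res, res, res))

# Three input views (toy placeholders for RGB images).
views = [np.zeros((137, 137, 3)) for _ in range(3)]
volumes = np.stack([coarse_volume(v) for v in views])

# A symmetric reduction (here: element-wise max) over the per-view
# volumes gives the same result for any ordering of the views,
# unlike sequential RNN fusion.
fused = volumes.max(axis=0)
fused_shuffled = volumes[[2, 0, 1]].max(axis=0)
assert np.allclose(fused, fused_shuffled)
```

Because each branch sees only its own view, the branches can also run in parallel, which is the acceleration the quoted survey refers to.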