2016
DOI: 10.1007/978-3-319-46487-9_13

Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation

Abstract: We present a 3D object detection method that uses regressed descriptors of locally-sampled RGB-D patches for 6D vote casting. For regression, we employ a convolutional auto-encoder that has been trained on a large collection of random local patches. During testing, scene patch descriptors are matched against a database of synthetic model view patches and cast 6D object votes which are subsequently filtered to refined hypotheses. We evaluate on three datasets to show that our method generalizes well to previous…

Cited by 265 publications (228 citation statements)
References 31 publications
“…The last method is to predict 3D locations of pixels or local shapes in the object space [2,16,23]. Brachmann et al [2] regress 3D coordinates and predict a class for each pixel using the auto-context random forest.…”
Section: Related Work
mentioning
confidence: 99%
“…A.1. Data augmentation for training2,3,4,13,14,15,16,17,18,24,30: sym = [I], the z-component of the rotation matrix is ignored. • Objects not in the list (non-symmetric): sym = [I] A.3.…”
mentioning
confidence: 99%
“…We first analyse the performance of the methods [36], [4], [2], [37] on the LINEMOD dataset. On the average, Kehl et al [36] outperforms other methods proving the superiority of learning deep features. Despite estimating 6D in RGB images, SSD-6D [37] exhibits the advantages of using CNN structures for 6D object pose estimation.…”
Section: A Analyses Based On Average Distance
mentioning
confidence: 99%
“…The studies in [28], [29] cope with texture-less objects. More recently, feature representations are learnt in an unsupervised fashion using deep convolutional (conv) networks (net) [35], [36]. While these methods fuse data coming from RGB and depth channels, a local belief propagation based approach [73] and an iterative refinement architecture [31], [32] are proposed in depth modality [74].…”
Section: Introduction
mentioning
confidence: 99%