2018 IEEE International Conference on Robotics and Automation (ICRA) 2018
DOI: 10.1109/icra.2018.8460654

When Regression Meets Manifold Learning for Object Recognition and Pose Estimation

Abstract: In this work, we propose a method for object recognition and pose estimation from depth images using convolutional neural networks. Previous methods addressing this problem rely on manifold learning to learn low-dimensional viewpoint descriptors and employ them in a nearest-neighbor search on an estimated descriptor space. In comparison, we create an efficient multi-task learning framework combining manifold descriptor learning and pose regression. By combining the strengths of manifold learning using triplet l…
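
The abstract is cut off before it states the loss, so the following is only a minimal sketch of what a multi-task objective of this kind could look like, not the authors' formulation. It assumes a hinge-style triplet loss on descriptors, an L2 loss on unit quaternions for pose (as the citation statements below attribute to Bui et al.), and a weighted sum of the two. The function names, the margin, and the weights w_pose and w_desc are hypothetical.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.01):
    """Hinge-style triplet loss on squared Euclidean descriptor distances.

    Pulls anchor/positive descriptors together and pushes anchor/negative
    descriptors at least `margin` further apart (margin is an assumed value).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.mean(np.maximum(0.0, d_pos - d_neg + margin))

def pose_l2_loss(q_pred, q_true):
    """L2 distance between the normalized predicted quaternion and the target."""
    q_pred = q_pred / np.linalg.norm(q_pred, axis=-1, keepdims=True)
    return np.mean(np.linalg.norm(q_pred - q_true, axis=-1))

def multi_task_loss(desc_a, desc_p, desc_n, q_pred, q_true,
                    w_pose=1.0, w_desc=1.0, margin=0.01):
    """Weighted sum of the pose and descriptor terms (weights are assumptions)."""
    return (w_pose * pose_l2_loss(q_pred, q_true)
            + w_desc * triplet_loss(desc_a, desc_p, desc_n, margin))

# Toy batch: one triplet of 2-D descriptors and one quaternion pair.
desc_a = np.array([[0.0, 0.0]])
desc_p = np.array([[0.1, 0.0]])
desc_n = np.array([[1.0, 0.0]])
q_pred = np.array([[2.0, 0.0, 0.0, 0.0]])   # unnormalized network output
q_true = np.array([[1.0, 0.0, 0.0, 0.0]])
loss = multi_task_loss(desc_a, desc_p, desc_n, q_pred, q_true)
```

In this toy batch the negative is already far from the anchor and the normalized prediction matches the target, so both terms vanish. In a real network the two terms would share a feature extractor and be minimized jointly.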

Cited by 28 publications (17 citation statements)
References 16 publications
“…Surface normal is also used as an additional modality in [6]. Another way of using depth information is to treat it as an extra image depth channel (RGBD) and feed it into a CNN [5], [24], [6], [7], [8], or random forest [21], [1], [22] or a fully connected sparse autoencoder [25] for feature extraction. Depth is also used to create point clouds, which are used for generating pose hypotheses with 3D-3D correspondences and ICP refinement in [12].…”
Section: Related Work (mentioning)
confidence: 99%
“…[16] propose to regress translation and rotation with the same network. Quaternions are used as the rotation representation for regression [11], [16], [8]. Bui et al [8] propose to use L2 loss function for rotation learning.…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, [15] proposes a descriptor for object templates, based on image and depth gradients. Deep Learning has also been applied to such approach, by learning to compute a descriptor from pairs or triplets of object images [34,1,36,5]. Like ours, these approaches do not require re-training, as it only requires to compute the descriptors for images of the new objects.…”
Section: D Object Detection and Pose Estimation From Color Images (mentioning)
confidence: 99%
“…With the success of deep learning in object recognition, deep neural network has been gradually applied to objects' 6D pose estimation. Multiple end-to-end CNN-based neural networks [10]- [13] have been proposed to map RGB images to 6D poses directly. Although end-to-end poses regression methods are simple, it is not clear whether such end-to-end algorithms have learned enough feature representations for pose estimation.…”
Section: Introduction (mentioning)
confidence: 99%