2016 IEEE Winter Conference on Applications of Computer Vision (WACV) 2016
DOI: 10.1109/wacv.2016.7477655

Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model

Abstract: Object recognition and pose estimation are two fundamental problems in computer vision. Recognizing objects and their poses/viewpoints is a critical component of many vision and robotic systems. Multiple viewpoints of an object lie on an intrinsic low-dimensional manifold in the input space (i.e., descriptor space). Different objects captured from the same set of viewpoints have manifolds with a common topology. In this paper we utilize this common topology between object manifolds by learning a l…

citations
Cited by 6 publications
(3 citation statements)
references
References 24 publications
“…There are two main lines of research that are relevant to our work: (i) 3D pose estimation given object category label and (ii) joint object category and pose estimation. There are many non-deep learning based approaches such as [12,13,14,15,16,17,18,19], which have designed systems to solve these two tasks. However, due to space constraints, we restrict our discussion to deep learning based approaches.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…The shared first five layers are able to build up the object-view manifolds, preserve them and enhance them in the pose subnetwork of the model, while the other subnetwork specializes in pose-invariant category recognition.
Approach | Category % | Pose (AAAI) %
(Lai et al, 2011b) | 94.30 (RGB + Depth) | 53.50
(Bakry & Elgammal, 2014) | 94.84 (RGB only) / 96.01 (RGB + Depth) | 76.01
(Zhang et al, 2013a) | 92.00 (RGB only) / 93.10 (RGB + Depth) | 61.57
(Bakry et al, 2016) | 85.00 | 77.31
Ours (EBM(800))…”
Section: Computational Analysis and Convergence
Citation type: mentioning
confidence: 99%
“…In computer vision tasks, while deep learning models are known to be vulnerable to adversarial perturbations in pixel values (Szegedy et al 2014; Croce and Hein 2020; Yin, Ruan, and Fieldsend 2022; Mu et al 2021, 2022), Engstrom et al (2019) show that a slight rotation of an input example can also fool DNNs. Although modern DNNs are believed to be able to learn geometric information from training data (Bakry et al 2016), they are not yet invariant to simple adversarial geometric transformations (Zhang et al 2020).…”
Section: Introduction
Citation type: mentioning
confidence: 99%