2016
DOI: 10.1007/978-3-319-49409-8_16
Overcoming Occlusion with Inverse Graphics

Abstract: Scene understanding tasks such as the prediction of object pose, shape, appearance and illumination are hampered by the occlusions often found in images. We propose a vision-as-inverse-graphics approach that handles these occlusions by combining a graphics renderer with a robust generative model (GM). Since searching over scene factors to obtain the best match for an image is very inefficient, we use a recognition model (RM), trained on synthetic data, to initialize the search. This paper…
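The pipeline the abstract describes can be sketched as a render-and-compare loop: a recognition model proposes initial scene factors, and a search against the renderer refines them under a robust pixel error. Everything below (the toy renderer, the random local search, and the particular robust loss) is an illustrative assumption, not the paper's actual implementation:

```python
import numpy as np

def robust_error(rendered, observed, scale=0.1):
    # Robust per-pixel loss: saturates for large residuals, so pixels the
    # scene model cannot explain (e.g. occluders) are down-weighted.
    diff = (rendered - observed) ** 2
    return np.sum(diff / (diff + scale ** 2))

def invert_scene(observed, recognition_model, render, n_steps=200, step=0.01):
    # Recognition model (RM) gives a fast initial guess of the scene
    # factors; random local search then refines it against the renderer.
    rng = np.random.default_rng(0)
    theta = recognition_model(observed)
    best = robust_error(render(theta), observed)
    for _ in range(n_steps):
        cand = theta + step * rng.standard_normal(theta.shape)
        err = robust_error(render(cand), observed)
        if err < best:          # keep only improving proposals
            theta, best = cand, err
    return theta, best
```

In the paper the search is over pose, shape, appearance and illumination factors; the sketch collapses these into a single parameter vector for brevity.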


Cited by 28 publications (19 citation statements). References 30 publications.
“…Analysis-by-synthesis approaches to computer vision A long line of work has interpreted computer vision as the inverse problem to computer graphics [25,45,30,26]. This 'analysis-by-synthesis' approach has been used for various tasks including character recognition, CAPTCHA-breaking, lane detection, object pose estimation, and human pose estimation [46,41,31,34,21,35]. To our knowledge, our work is the first to use an analysis-by-synthesis approach to infer a hierarchical 3D object-based representation of real multi-object scenes while exploiting inductive biases about the contacts between objects.…”
Section: Related Work
confidence: 99%
“…where i indexes pixels of the depth image. Pixels whose ray does not intersect an object are assigned the maximum depth value D. A similar likelihood function on depth images was used in [34].…”
Section: Parsing Scenes With Fully Occluded Objects and Number Uncert...
confidence: 99%
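The likelihood this statement alludes to can be illustrated with a per-pixel Gaussian on the depth image, where rays that miss every object fall back to a maximum depth D. The Gaussian noise model and all names below are assumptions for illustration, not taken from the cited paper:

```python
import numpy as np

D_MAX = 10.0  # maximum depth assigned to rays that hit no object

def depth_log_likelihood(rendered_depth, observed_depth, sigma=0.05):
    # Per-pixel Gaussian log-likelihood over the depth image; pixels whose
    # ray misses every object carry the background depth D_MAX.
    rendered = np.where(np.isfinite(rendered_depth), rendered_depth, D_MAX)
    r = (rendered - observed_depth) / sigma
    return -0.5 * np.sum(r ** 2) - rendered.size * np.log(sigma * np.sqrt(2 * np.pi))
```

Representing misses with `inf` and clamping them to D keeps the likelihood well-defined even when the hypothesized scene contains fewer objects than the observation.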
“…Zienkiewicz et al [12] used render-and-compare for real-time height mapping fusion. Several recent works used render-and-compare for solving a wide range of vision problems: Tewari et al [13] learned unsupervised monocular face reconstruction; Kundu et al [14] introduced a framework for instance-level 3D scene understanding; Moreno et al [15] estimated 6D object pose in cluttered synthetic scenes. More closely related is the DeepIM method by Li et al [16], who formulated 6D object pose estimation as an iterative pose refinement process that refines the initial pose by trying to match the rendered image with the observed image.…”
Section: Related Work
confidence: 99%
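The iterative render-and-compare refinement attributed to DeepIM can be sketched as the loop below. Here `predict_delta` stands in for the learned network that maps a (rendered, observed) image pair to a pose update; the function names and the additive pose update are illustrative assumptions:

```python
import numpy as np

def refine_pose(observed, pose, render, predict_delta, n_iters=5):
    # Render-and-compare refinement: repeatedly render at the current pose
    # estimate, compare against the observed image, and apply a predicted
    # pose update until the rendering matches the observation.
    for _ in range(n_iters):
        rendered = render(pose)
        pose = pose + predict_delta(rendered, observed)
    return pose
```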
“…It involves a considerably more complicated simulator that takes as input a set of 20 parameters and deterministically renders an image of an object (in this case, a teapot) on a uniform background. This is based on the generative model used by Moreno et al (2016). We focus on the task of learning the posterior distribution of two colour parameters in a setting where there are two possible explanations for the observed image and thus the posterior is expected to be bi-modal.…”
Section: Experiments 3: Gradient-Free ROMC
confidence: 99%
“…• Five illumination parameters that characterise the lighting on the object. Unlike Moreno et al (2016) who use spherical harmonics to model illumination, we use single-source directional lighting as it is more intuitive and natural.…”
Section: F Additional Information For Exp...
confidence: 99%
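The single-source directional lighting mentioned in this statement can be sketched as Lambertian shading, where intensity is proportional to the cosine between the surface normal and the light direction. The function name and the ambient term are illustrative assumptions; the cited work's actual shading model may differ:

```python
import numpy as np

def directional_shade(normals, light_dir, albedo=1.0, ambient=0.1):
    # Lambertian shading under one directional light: brightness follows
    # the cosine between surface normal and (unit) light direction.
    l = np.asarray(light_dir, dtype=float)
    l /= np.linalg.norm(l)
    cos = np.clip(normals @ l, 0.0, None)  # back-facing points get no direct light
    return ambient + albedo * cos
```

Compared with a spherical-harmonics illumination model, a single light direction has far fewer parameters, which is what makes it more intuitive to specify by hand.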