This paper presents a computational model to recover the most likely
interpretation of the 3D scene structure from a planar image, where some
objects may occlude others. The estimated scene interpretation is obtained by
integrating global and local cues and provides both the complete
disoccluded objects that form the scene and their ordering according to depth.
Our method first computes several distal scenes which are compatible with the
proximal planar image. To compute these different hypothesized scenes, we
propose a perceptually inspired object disocclusion method, which works by
minimizing Euler's elastica as well as by incorporating the relatability of
partially occluded contours and the convexity of the disoccluded objects. Then,
to estimate the preferred scene, we rely on a Bayesian model and define
probabilities that take into account the global complexity of the objects in the
hypothesized scenes as well as the effort of bringing these objects into their
relative positions in the planar image, which is also measured by a quantity
based on Euler's elastica. The model is illustrated with numerical experiments
on both synthetic and real images, showing the ability of our model to
reconstruct the occluded objects and the preferred perceptual order among them.
We also present results on images from the Berkeley dataset with provided
figure-ground ground-truth labeling.
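
For readers unfamiliar with the term, Euler's elastica is classically the curve minimizing the integral of (alpha + beta * kappa^2) along its arc length, where kappa is the curvature. The following is a minimal sketch of how that energy can be approximated on a polyline; the function name `elastica_energy`, the weights `alpha` and `beta`, and the turning-angle discretization are illustrative assumptions and are not taken from the method described above.

```python
import numpy as np

def elastica_energy(points, alpha=1.0, beta=1.0):
    """Approximate the integral of (alpha + beta * kappa^2) ds on an open 2D polyline.

    Hypothetical helper for illustration only; the discretization is an assumption.
    """
    pts = np.asarray(points, dtype=float)
    edges = np.diff(pts, axis=0)                 # edge vectors between samples
    lengths = np.linalg.norm(edges, axis=1)      # edge lengths
    # Signed turning angle at each interior vertex.
    cross = edges[:-1, 0] * edges[1:, 1] - edges[:-1, 1] * edges[1:, 0]
    dot = np.einsum("ij,ij->i", edges[:-1], edges[1:])
    theta = np.arctan2(cross, dot)
    # Arc length attributed to each interior vertex (half of each adjacent edge).
    ds = 0.5 * (lengths[:-1] + lengths[1:])
    kappa = theta / ds                           # discrete curvature estimate
    return alpha * lengths.sum() + beta * np.sum(kappa**2 * ds)

# A straight segment has no bending cost, so only the length term remains.
straight = [(0, 0), (1, 0), (2, 0)]
print(elastica_energy(straight))

# A unit semicircle has length pi and curvature 1, so the energy is close to 2*pi.
t = np.linspace(0.0, np.pi, 1001)
semicircle = np.column_stack([np.cos(t), np.sin(t)])
print(elastica_energy(semicircle))
```

Under this sketch, a smooth completion of an occluded contour would receive a lower energy than a sharply bent one of similar length, which is the intuition behind using an elastica-type cost for disocclusion.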