Recovering 3D scenes from 2D images is an under-constrained task; optimal estimation depends upon knowledge of the underlying scene statistics. Here we introduce the Southampton-York Natural Scenes dataset (SYNS: https://syns.soton.ac.uk), which provides comprehensive scene statistics useful for understanding biological vision and for improving machine vision systems. To capture the diversity of environments that humans encounter, scenes were surveyed at random locations within 25 indoor and outdoor categories. Each survey includes (i) spherical LiDAR range data, (ii) high-dynamic-range spherical imagery, and (iii) a panorama of stereo image pairs. We envisage many uses for the dataset and present one example: an analysis of surface attitude statistics, conditioned on scene category and viewing elevation. Surface normals were estimated using a novel adaptive scale selection algorithm. Across categories, surface attitude below the horizon is dominated by the ground plane (0° tilt). Near the horizon, probability density is elevated at 90°/270° tilt due to vertical surfaces (trees, walls). Above the horizon, probability density is elevated near 0° slant due to overhead structure such as ceilings and leaf canopies. These structural regularities represent potentially useful prior assumptions for human and machine observers, and may predict human biases in perceived surface attitude.
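To make the slant and tilt terminology concrete, the following is a minimal Python sketch that converts a single surface normal and viewing direction into slant and tilt angles. It is not the SYNS processing pipeline: the function name `slant_tilt`, the z-up world frame, and the choice to measure tilt clockwise from the image "up" direction (so that the ground plane has 0° tilt and surfaces facing left or right have 270° or 90° tilt) are assumptions chosen to agree with the values quoted in the abstract.

```python
import numpy as np

def slant_tilt(normal, view_dir, world_up=(0.0, 0.0, 1.0)):
    """Return (slant, tilt) in degrees for one surface normal.

    Assumed convention (hypothetical, not taken from the paper):
    slant is the angle between the normal and the line of sight;
    tilt is the direction of the normal projected into the image
    plane, measured from image "up" towards image "right".
    """
    n = np.asarray(normal, dtype=float)
    v = np.asarray(view_dir, dtype=float)  # points from observer to surface
    up = np.asarray(world_up, dtype=float)
    n = n / np.linalg.norm(n)
    v = v / np.linalg.norm(v)

    # Orient the normal so that it faces the observer.
    if np.dot(n, v) > 0:
        n = -n

    slant = np.degrees(np.arccos(np.clip(-np.dot(n, v), -1.0, 1.0)))

    # Build an image-plane basis perpendicular to the line of sight.
    right = np.cross(v, up)
    if np.linalg.norm(right) < 1e-12:
        raise ValueError("view direction is parallel to world_up; tilt is undefined")
    right = right / np.linalg.norm(right)
    image_up = np.cross(right, v)

    tilt = np.degrees(np.arctan2(np.dot(n, right), np.dot(n, image_up))) % 360.0
    return slant, tilt

# Ground plane viewed 30 degrees below the horizon.
elevation = np.radians(-30.0)
view = (0.0, np.cos(elevation), np.sin(elevation))
print(slant_tilt((0.0, 0.0, 1.0), view))
```

Under this convention, the ground plane viewed 30° below the horizon has a slant of about 60° and a tilt of 0°, while a vertical wall viewed at the horizon has a tilt of 90° or 270° depending on which way it faces.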
Binocular stereopsis is a powerful visual depth cue. To exploit it, the brain matches features from the two eyes' views and measures their interocular disparity. This works well for matte surfaces because disparities indicate true surface locations. However, specular (glossy) surfaces are problematic because highlights and reflections are displaced from the true surface in depth, leading to information that conflicts with other cues to 3D shape. Here, we address the question of how the visual system identifies the disparity information created by specular reflections. One possibility is that the brain uses monocular cues to identify that a surface is specular and modifies its interpretation of the disparities accordingly. However, by characterizing the behavior of specular disparities we show that the disparity signals themselves provide key information ("intrinsic markers") that enable potentially misleading disparities to be identified and rejected. We presented participants with binocular views of specular objects and asked them to report perceived depths by adjusting probe dots. For simple surfaces, which do not exhibit intrinsic indicators that the disparities are "wrong," participants incorrectly treat disparities at face value, leading to erroneous judgments. When surfaces are more complex we find the visual system also errs where the signals are reliable, but rejects and interpolates across areas with large vertical disparities and horizontal disparity gradients. This suggests a general mechanism in which the visual system assesses the origin and utility of sensory signals based on intrinsic markers of their reliability.
psychophysics | perception | gloss | texture | computational analysis
Shiny objects such as sports cars, jewelry, and consumer electronics can be beautiful to look at.
However, such objects pose a difficult challenge to the visual system: if all (or most) of the light reaching the eye comes from the reflections of other nearby objects, how does the viewer discern the object itself? This problem becomes more acute when viewing with two eyes. Unlike shading or texture markings, the positions of reflections relative to a specular (shiny) surface depend on the observer's viewpoint. This means that when the surface is viewed binocularly (i.e., from two viewpoints at the same time), corresponding reflections fall on different surface locations. In consequence, the binocular disparities created by specular reflections indicate depth positions displaced from the object's physical surface (1, 2) and the 3D shape specified by disparity can be radically different from the true shape of the object. For special cases, such as an ideal planar mirror, the visual system could not, even in principle, estimate the true depths of the surface from the reflections. However, for more complex shapes, such as a polished metal kettle, we rarely encounter problems judging shape. Most models of biological vision place heavy weight on binocular disparity cues, whereas artificial systems often rely almost exclusively ...
There have been suggestions that human navigation may depend on representations that have no metric, Euclidean interpretation, but that hypothesis remains contentious. An alternative is that observers build a consistent 3D representation of space. Using immersive virtual reality, we measured the ability of observers to point to targets in mazes that had zero, one or three 'wormholes', regions where the maze changed in configuration (invisibly). In one model, we allowed the configuration of the maze to vary to best explain the pointing data; in a second model we also allowed the local reference frame to be rotated through 90, 180 or 270 degrees. The latter model outperformed the former in the wormhole conditions, inconsistent with a Euclidean cognitive map.
Because specular reflection is view-dependent, shiny surfaces behave radically differently from matte, textured surfaces when viewed with two eyes. As a result, specular reflections pose substantial problems for binocular stereopsis. Here we use a combination of computer graphics and geometrical analysis to characterize the key respects in which specular stereo differs from standard stereo, to identify how and why the human visual system fails to reconstruct depths correctly from specular reflections. We describe rendering of stereoscopic images of specular surfaces in which the disparity information can be varied parametrically and independently of monocular appearance. Using the generated surfaces and images, we explain how stereo correspondence can be established with known and unknown surface geometry. We show that even with known geometry, stereo matching for specular surfaces is nontrivial because points in one eye may have zero, one, or multiple matches in the other eye. Matching features typically yield skew (nonintersecting) rays, leading to substantial ortho-epipolar components to the disparities, which makes deriving depth values from matches nontrivial. We suggest that the human visual system may base its depth estimates solely on the epipolar components of disparities while treating the ortho-epipolar components as a measure of the underlying reliability of the disparity signals. Reconstructing virtual surfaces according to these principles reveals that they are piece-wise smooth with very large discontinuities close to inflection points on the physical surface. Together, these distinctive characteristics lead to cues that the visual system could use to diagnose specular reflections from binocular information.
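The point about skew rays can be illustrated with a short Python sketch. For matte surfaces, the two eyes' rays to a matched feature intersect; for specular reflections they generally do not. The code below finds the closest approach of two rays and returns the midpoint as a depth estimate together with the residual gap between the rays. This is standard closest-point geometry, not the authors' reconstruction method; the function name `triangulate_skew_rays` and the use of the gap as a stand-in for the ortho-epipolar disparity component are assumptions made for illustration.

```python
import numpy as np

def triangulate_skew_rays(origin_l, dir_l, origin_r, dir_r):
    """Closest approach of two rays, one per eye.

    Returns (midpoint, gap): the point halfway between the closest
    points on the two rays, and the distance between those points.
    A gap of zero means the rays intersect; a large gap means the
    match is strongly skew (hypothetical reliability measure).
    """
    p1, d1, p2, d2 = (np.asarray(x, dtype=float)
                      for x in (origin_l, dir_l, origin_r, dir_r))
    w = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if denom < 1e-12 * a * c:
        raise ValueError("rays are parallel; no unique closest point")

    s = (b * e - c * d) / denom  # parameter along the left-eye ray
    t = (a * e - b * d) / denom  # parameter along the right-eye ray
    q1 = p1 + s * d1
    q2 = p2 + t * d2
    return 0.5 * (q1 + q2), float(np.linalg.norm(q1 - q2))

# Eyes 6 cm apart; the right-eye ray is displaced vertically by 1 cm at 1 m.
left_eye, right_eye = (-0.03, 0.0, 0.0), (0.03, 0.0, 0.0)
point, gap = triangulate_skew_rays(left_eye, (0.03, 0.0, 1.0),
                                   right_eye, (-0.03, 0.01, 1.0))
print(point, gap)
```

A scheme along the lines suggested in the abstract would take the depth from the midpoint (the epipolar component) and treat a large gap as evidence that the disparity signal is unreliable.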