Deep convolutional networks (DCNNs) are achieving previously unseen performance in object classification, raising questions about whether DCNNs operate similarly to human vision. In biological vision, shape is arguably the most important cue for recognition. We tested the role of shape information in DCNNs trained to recognize objects. In Experiment 1, we presented a trained DCNN with object silhouettes that preserved overall shape but were filled with surface texture taken from other objects. Shape cues appeared to play some role in the classification of artifacts, but little or none for animals. In Experiments 2–4, DCNNs showed no ability to classify glass figurines or outlines but correctly classified some silhouettes. Aspects of these results led us to hypothesize that DCNNs do not distinguish objects' bounding contours from other edges, and that DCNNs access some local shape features, but not global shape. In Experiment 5, we tested this hypothesis with displays that preserved local features but disrupted global shape, and vice versa. With disrupted global shape, which reduced human accuracy to 28%, DCNNs gave the same classification labels as with ordinary shapes. Conversely, local contour changes eliminated accurate DCNN classification but caused no difficulty for human observers. These results provide evidence that DCNNs have access to some local shape information in the form of local edge relations, but they have no access to global object shapes.
Perception of objects in ordinary scenes requires interpolation processes connecting visible areas across spatial gaps. Most research has focused on 2-D displays, and models have been based on 2-D, orientation-sensitive units. The authors present a view of interpolation processes as intrinsically 3-D and producing representations of contours and surfaces spanning all 3 spatial dimensions. The authors propose a theory of 3-D relatability that indicates for a given edge which orientations and positions of other edges in 3 dimensions may be connected to it, and they summarize the empirical evidence for 3-D relatability. The theory unifies and illuminates a number of fundamental issues in object formation, including the identity hypothesis in visual completion, the relations of contour and surface processes, and the separation of local and global processing. The authors suggest that 3-D interpolation and 3-D relatability have major implications for computational and neural models of object perception.
We report four experiments in which the strength of edge interpolation in illusory figure displays was tested. In Experiment 1, we investigated the relative contributions of the lengths of luminance-specified edges and the gaps between them to perceived boundary clarity as measured by using a magnitude estimation procedure. The contributions of these variables were found to be best characterized by a ratio of the length of luminance-specified contour to the length of the entire edge (specified plus interpolated edge). Experiment 2 showed that this ratio predicts boundary clarity for a wide range of ratio values and display sizes. There was no evidence that illusory figure boundaries are clearer in displays with small gaps than they are in displays with larger gaps and equivalent ratios. In Experiment 3, using a more sensitive pairwise comparison paradigm, we again found no such effect. Implications for boundary interpolation in general, including perception of partially occluded objects, are discussed. The dependence of interpolation on the ratio of physically specified edges to total edge length has the desirable ecological consequence that unit formation will not change with variations in viewing distance.

In a world of discrete surfaces and objects, the distance between points will necessarily be related to the likelihood that those points are part of the same surface or object. Such a relationship would also hold for points in the projections of surfaces and objects. However, when spatial gaps are projected to observers, the projected distance between two points depends on viewing distance.
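The support-ratio account above has a simple arithmetic core. As a minimal sketch (the function name and the example lengths are illustrative, not taken from the paper), the ratio of luminance-specified contour to total edge length is invariant under uniform scaling, which is why it would not change with viewing distance:

```python
def support_ratio(specified_length, interpolated_length):
    """Ratio of luminance-specified contour length to total edge length
    (specified plus interpolated), the quantity the abstract reports as
    the best predictor of perceived boundary clarity."""
    total = specified_length + interpolated_length
    if total == 0:
        raise ValueError("edge has zero total length")
    return specified_length / total

# Hypothetical projected lengths for the same display at two viewing
# distances: halving the distance doubles both projected lengths, so
# the ratio is unchanged.
near = support_ratio(2.0, 1.0)  # closer: larger projected lengths
far = support_ratio(1.0, 0.5)   # farther: both lengths scaled by 0.5
assert near == far
```

Because both terms scale by the same factor, the ratio cancels the scale, matching the ecological argument in the abstract that unit formation should be stable across viewing distances.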