Abstract
Most accounts of ‘Shape-from-X’ start with a computational theory of a particular cue and then outline methods for extracting the relevant data from the image. Here we take the opposite approach, starting with image statistics and investigating how they might be exploited to estimate shape across variations in lighting, reflectance, and texture. We rendered a large number of objects and looked for image statistics that vary systematically with properties of the shape. We find that several simple measurements, derived from filters at different orientations and scales, yield surprisingly reliable information about 3D shape. In a series of experiments we show that changes in these statistics predict certain successes and failures of human perception.
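The abstract does not specify the exact filters used; as a minimal sketch, assuming a bank of oriented Gabor-like filters, statistics of this kind might be computed as follows (all parameter choices, e.g. `n_orient` and `wavelengths`, are illustrative, not the authors' values):

```python
# Minimal sketch (not the authors' actual pipeline): summarize an image
# by its filter-response energy per orientation and scale, using a bank
# of Gabor-like filters. All parameters here are illustrative.
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, wavelength, theta):
    """Odd-symmetric Gabor kernel at orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)           # rotated axis
    envelope = np.exp(-(x**2 + y**2) / (2.0 * (size / 6.0) ** 2))
    return envelope * np.sin(2.0 * np.pi * xr / wavelength)

def orientation_scale_stats(image, n_orient=8, wavelengths=(4, 8, 16)):
    """Mean squared filter response in each (scale, orientation) bin."""
    stats = np.zeros((len(wavelengths), n_orient))
    for i, wl in enumerate(wavelengths):
        size = int(4 * wl) | 1                           # odd kernel size
        for j in range(n_orient):
            theta = j * np.pi / n_orient
            resp = fftconvolve(image, gabor_kernel(size, wl, theta),
                               mode="same")
            stats[i, j] = np.mean(resp ** 2)
    return stats

# Example: orientation/scale statistics of a smooth shaded 'bump'.
yy, xx = np.mgrid[0:128, 0:128] / 128.0
bump = np.exp(-((xx - 0.5) ** 2 + (yy - 0.5) ** 2) / 0.02)
print(orientation_scale_stats(bump).round(5))
```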
In a gauge probe task, shape perception remained remarkably constant across changes in surface reflectance (glossiness, albedo). Although the images differ substantially on a pixel-by-pixel basis, the orientation statistics remain stable across these reflectance changes, suggesting that they could underlie human performance.
In another task, observers were presented with shaded objects that had been subjected to certain shape transformations. The task was to adjust the magnitude of shear or stretch applied to a textured object until it appeared to be the same shape as the shaded object. Subjects underestimated the shear transformation for shaded objects and the stretch transformation for textured objects, consistent with the predictions derived from our image-statistics analysis. Thus, differences between cues may be predicted by a common front end.
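As an illustration of the transformations involved (a sketch under the assumption that the stimuli can be treated as images warped by 2D affine maps; the actual magnitudes and rendering pipeline are not specified here), the shear and stretch could be applied as:

```python
# Hedged sketch: apply a horizontal shear or a vertical stretch to an
# image via an affine warp, so the statistics computed above can be
# compared before and after the transformation. Magnitudes are made up.
import numpy as np
from scipy.ndimage import affine_transform

def shear_image(image, k):
    """Forward map (y, x) -> (y, x + k*y); scipy expects the inverse map."""
    return affine_transform(image, np.array([[1.0, 0.0], [-k, 1.0]]))

def stretch_image(image, s):
    """Forward map (y, x) -> (s*y, x), i.e. a vertical scaling by s."""
    return affine_transform(image, np.array([[1.0 / s, 0.0], [0.0, 1.0]]))

# Compare statistics of original vs. transformed versions of `bump`
# (defined in the previous sketch):
# delta = orientation_scale_stats(shear_image(bump, 0.3)) \
#         - orientation_scale_stats(bump)
```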
In a third task, we applied transformations to texture and shading that elicit illusions of 3D shape. The strength of these illusions correlated with the induced changes in the orientation and scale statistics. Together, these findings suggest that to understand 3D shape perception, it is useful to reformulate the problem in terms of the image measurements made by the front end of vision.
RF supported by DFG grant 624/1-1.