Abstract
Estimating 3D surface orientation (i.e., slant and tilt) is an important first step toward estimating 3D shape. To understand how different image cues are combined for surface orientation estimation, we examine human estimation of tilt in stereo-images of natural scenes. First, we built a custom display system with a stereo-enabled Vpixx ProPixx projector and a Harkness Clarus 140 polarization-maintaining screen. Next, we obtained a large database of stereo-images of natural scenes with precisely co-registered range data. These stereo-images provide rich cues that influence surface orientation perception; the range data provides the ground-truth tilt, slant, and distance at each pixel. We binned the range data according to local tilt, slant, and distance and randomly sampled corresponding 1deg image patches within each bin. Then, we assessed human tilt estimation using these patches. Human observers sat 3m from the screen such that the left and right retinal images were identical to the images that would have been formed by the original scene. On each trial, observers binocularly viewed a patch of scene through a 1deg aperture. The task was to estimate the tilt of the depicted surface using a mouse-controlled probe. We compared human estimates of tilt to ground-truth tilt computed directly from the range data. A rich set of results emerged. First, human tilt estimation was generally accurate but was biased toward the cardinal tilts (i.e., 0, 90, and 180deg: tilts of surfaces slanted about vertical and horizontal axes). Second, tilt estimation error varied systematically with ground-truth tilt: errors at the cardinal tilts were lower than at other tilts. Third, the pattern of human biases and errors matched the performance of a previously developed ideal observer for 3D surface tilt estimation in natural scenes (Burge & Geisler, 2015). Thus, our preliminary findings suggest human observers may optimally process local cues to 3D surface orientation.
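The computation of ground-truth orientation from range data can be illustrated with a minimal sketch. Assuming, as is standard, that local tilt is the image-plane orientation of the range gradient and slant grows with the gradient's magnitude, one could compute per-pixel tilt and slant from a co-registered range map as follows. The function names, the `px_deg` sampling parameter, and the circular-error helper are illustrative assumptions, not the authors' actual pipeline:

```python
import numpy as np

def tilt_slant_from_range(z, px_deg=1.0):
    # Illustrative sketch (not the authors' code): estimate local 3D tilt
    # as the orientation of the range gradient, and slant from the
    # gradient magnitude, given a 2D range map z on a regular pixel grid.
    # px_deg is an assumed spatial-sampling scale factor.
    dz_dy, dz_dx = np.gradient(z)                       # rows = y, cols = x
    tilt = np.degrees(np.arctan2(dz_dy, dz_dx)) % 360.0  # 0-360 deg
    slant = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy) / px_deg))
    return tilt, slant

def tilt_error_deg(est, truth):
    # Signed circular difference in degrees, wrapped to (-180, 180],
    # as one would use to compare probe settings with ground truth.
    d = (est - truth) % 360.0
    return d - 360.0 * (d > 180.0)
```

Under this convention, a planar ramp receding purely in the horizontal direction has tilt 0deg (a cardinal tilt), and estimation errors are scored on the circle so that, e.g., estimates of 350deg against a ground truth of 10deg count as a 20deg error rather than 340deg.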
Meeting abstract presented at VSS 2016