Seha Kim, Johannes Burge; Human surface tilt estimation in natural and artificial 3D scenes. Journal of Vision 2017;17(10):404. doi: https://doi.org/10.1167/17.10.404.
© ARVO (1962-2015); The Authors (2016-present)
Estimating 3D surface orientation (slant and tilt) is an important task for sighted organisms. Previous studies have focused on artificial stimuli. Here, we study human surface tilt estimation with both artificial and natural stimuli. We obtained a large database of stereo-images of natural scenes providing rich image cues to surface orientation; precisely co-registered range data provided ground-truth tilt, slant, and distance at each pixel. We created a set of artificially textured (plaids, pink noise) planar surfaces matched to the tilt, slant, distance, and contrast of the natural stimuli. We sampled natural and artificial stereo-image stimuli based on ground-truth tilt: 3600 stereo-patches were randomly selected, 150 for each ground-truth tilt. Human observers viewed natural and artificial surfaces through a small aperture (1 deg) and indicated their tilt estimates with a mouse-controlled probe. Tilt estimation with natural stimuli differs markedly from tilt estimation with artificial stimuli. Performance in natural scenes is much less accurate and less precise than in artificial scenes. Natural tilt estimates are strongly affected by a tilt prior, whereas artificial tilt estimates are unaffected. These differences can largely be attributed to the non-planar surface structure of natural scenes. Remarkably, despite these differences, the natural and artificial tilt estimates are equally good indicators of ground-truth tilt. Moreover, human performance is tightly predicted (including trial-by-trial errors), with zero free parameters, by an ideal observer model for tilt estimation in natural scenes (Burge & Geisler, 2016). The ideal observer reports the Bayes-optimal minimum-mean-squared-error (MMSE) tilt estimate given three local image cues (luminance, texture, and disparity gradients). The strong similarities between human and ideal performance suggest that the human visual system is tuned to make optimal use of image information from local areas of natural scenes.
These findings show that despite biases and overall imprecision, human 3D tilt estimation is a lawful perceptual process governed by priors and local measurements.
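The MMSE estimate mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation; it only shows one common way to take a posterior-mean estimate of a circular variable such as tilt, using the circular mean, which minimizes the expected squared chord distance on the unit circle. The function name `circular_mmse_tilt`, the 15-degree bin spacing (24 bins, consistent with 3600 patches at 150 per tilt), and the example posterior are all assumptions made for illustration. How the posterior would be obtained from the luminance, texture, and disparity cues is not shown.

```python
import numpy as np

def circular_mmse_tilt(tilts_deg, posterior):
    """Posterior circular mean of tilt, in degrees within [0, 360).

    tilts_deg : candidate tilt values (degrees); hypothetical bin centers.
    posterior : non-negative weights over those tilts (need not sum to 1).
    """
    theta = np.deg2rad(np.asarray(tilts_deg, dtype=float))
    p = np.asarray(posterior, dtype=float)
    p = p / p.sum()  # normalize to a probability distribution
    # Average the unit vectors, then read off the angle of the resultant.
    resultant = np.sum(p * np.exp(1j * theta))
    return float(np.rad2deg(np.angle(resultant)) % 360.0)

# Assumed tilt grid: 24 bins spaced 15 degrees apart.
tilt_bins = np.arange(0, 360, 15)

# Hypothetical posterior with equal mass on 75 and 105 degrees.
example_posterior = np.zeros(tilt_bins.size)
example_posterior[[5, 7]] = 0.5
estimate = circular_mmse_tilt(tilt_bins, example_posterior)
```

Averaging unit vectors instead of raw angles keeps the estimate sensible across the 0/360 wrap-around: a posterior split between 350 and 10 degrees yields an estimate near 0, not 180.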
Meeting abstract presented at VSS 2017