Emily A. Cooper, Anthony M. Norcia; Perceived depth in natural images reflects encoding of low-level depth statistics. Journal of Vision 2014;14(10):1112. doi: https://doi.org/10.1167/14.10.1112.
© ARVO (1962-2015); The Authors (2016-present)
Seeing in 3D is typically understood as relying on a patchwork of visual depth cues. Accessing these cues requires computations that have challenged computer-vision algorithms and could only feasibly be performed by late-stage neural integration mechanisms. However, statistical analyses of natural scenes have revealed low-level luminance patterns that are predictive of distances, and that could be accessed by early-stage visual mechanisms at low computational cost (e.g., Potetz & Lee, 2003; Su, Cormack, & Bovik, 2013). Optimal-coding models predict that the visual system should allocate its computational resources to exploit these patterns, and that this allocation should affect perceptual judgments. For example, darker points tend to be farther away than brighter points in natural scenes. This pattern is reflected in V1 cell tunings (Samonds, Potetz, & Lee, 2012). In the current work, we tested the model prediction that perceptual judgments will also be affected by this pattern. We asked whether scenes that conform better to a "darker is farther" pattern are perceived as more 3D. We developed an image-processing algorithm that smoothly modulates luminance-depth patterns to make an individual image more or less consistent with natural scene statistics. This algorithm was applied to a set of photographs of natural and man-made scenes. Participants (n = 20) judged which version of a scene appeared more 3D. We compared the scene-statistics manipulation to a classic depth cue (binocular disparity). The results show that perceived depth agrees with the optimal-coding prediction: versions of scenes with exaggerated luminance-depth patterns were seen as more 3D. The increase in positive 3D judgments caused by manipulating the scene statistics was ~25% as large as the increase caused by adding binocular disparity. We propose a model of population encoding in early visual cortex that could feasibly feed relevant depth-from-luminance information forward to higher visual areas.
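The abstract does not specify how the image-processing algorithm works, so the following is only an illustrative sketch of what "modulating a luminance-depth pattern" could mean, not the authors' method. It assumes each pixel has a luminance in [0, 1] and a known depth, and it shifts luminance in proportion to standardized depth. The names `pearson`, `modulate`, and `strength` are hypothetical and chosen for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def modulate(luminance, depth, strength):
    """Shift luminance along standardized depth (assumed scheme).

    strength > 0 darkens far pixels and brightens near ones, pushing the
    image toward "darker is farther"; strength < 0 does the opposite.
    Results are clipped to the [0, 1] luminance range.
    """
    n = len(depth)
    mean_d = sum(depth) / n
    std_d = sqrt(sum((d - mean_d) ** 2 for d in depth) / n)
    shifted = [
        lum - strength * (d - mean_d) / std_d
        for lum, d in zip(luminance, depth)
    ]
    return [min(1.0, max(0.0, v)) for v in shifted]


# Toy four-pixel "image": flattened luminance values and per-pixel depths.
luminance = [0.2, 0.8, 0.4, 0.6]
depth = [1.0, 2.0, 3.0, 4.0]

exaggerated = modulate(luminance, depth, strength=0.1)
reduced = modulate(luminance, depth, strength=-0.1)

print(pearson(luminance, depth))
print(pearson(exaggerated, depth))
print(pearson(reduced, depth))
```

In this sketch, a more negative luminance-depth correlation corresponds to a version of the scene that conforms more closely to the "darker is farther" statistic, and a more positive one to a version that conforms less.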
Meeting abstract presented at VSS 2014