Abstract
A variety of measurements of the statistical properties of low-level features at points of gaze in natural scenes have been published recently; measures such as luminance contrast and edge density at fixation loci are statistically different from those at randomly selected locations. In this study, we examined the disparity statistics of natural scenes at loci fixated by human observers. Surprisingly, we found that human fixations tend toward areas of relatively smooth scene depth. We computed dense disparity maps from 76 stereo images using a correspondence algorithm based on local correlation; the 48 image pairs yielding the lowest uncertainty in the disparity maps were used. The stereo pairs were presented on two calibrated monitors through a mirror stereoscope. The subjects were asked to free-view each 20 × 15 deg stereo image for 10 seconds, during which eye movements were recorded, yielding a set of fixation maps paired with each disparity map. We then extracted patches from the disparity maps at the fixation loci given by the corresponding fixation maps. For comparison, we ran a simulation that, repeated 100 times, extracted the same number of patches at locations drawn uniformly at random from each disparity map. Comparing the patches fixated by human subjects with those from the 100 random simulations, we found that disparity contrasts (the standard deviation of the disparity within a patch) and mean disparity gradients were generally lower at fixated patches than at randomly selected patches, and that this difference peaked at ∼ 1 deg. This suggests that human gaze tends to seek regions where scene depth is changing smoothly. One possible explanation is that the disparity computation is better posed away from depth discontinuities. Since large disparity contrasts and gradients are often coincident with large luminance contrasts and gradients, it will be necessary to reconcile these findings with those based on luminance.
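The comparison between fixated and randomly selected patches can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the square patch shape, the patch half-width in pixels (`half`), and the use of NumPy's finite-difference gradient are all assumptions made for illustration. It also assumes every fixation lies at least `half` pixels from the map border.

```python
import numpy as np

def patch_stats(disparity, row, col, half):
    """Disparity contrast (std) and mean gradient magnitude of one square patch.

    `half` is a hypothetical patch half-width in pixels; the patch is
    (2 * half + 1) pixels on a side and must lie fully inside the map.
    """
    patch = disparity[row - half:row + half + 1, col - half:col + half + 1]
    gy, gx = np.gradient(patch)  # finite differences along rows, then columns
    return patch.std(), np.hypot(gx, gy).mean()

def mean_stats(disparity, points, half):
    """Average (contrast, gradient) over a set of (row, col) patch centers."""
    stats = np.array([patch_stats(disparity, r, c, half) for r, c in points])
    return stats.mean(axis=0)

def random_baseline(disparity, n_points, half, n_runs=100, seed=0):
    """Repeat uniform random sampling of patch centers n_runs times.

    Returns an array of shape (n_runs, 2): one (contrast, gradient) mean per run.
    """
    rng = np.random.default_rng(seed)
    h, w = disparity.shape
    runs = []
    for _ in range(n_runs):
        # Draw centers so that every patch stays inside the map.
        rows = rng.integers(half, h - half, size=n_points)
        cols = rng.integers(half, w - half, size=n_points)
        runs.append(mean_stats(disparity, zip(rows, cols), half))
    return np.array(runs)
```

Calling `mean_stats(disparity_map, fixations, half)` gives the pair of statistics at the fixated loci, and `random_baseline(disparity_map, len(fixations), half)` gives 100 pairs to compare against. Repeating both calls over a range of `half` values would show how the difference varies with patch size, once pixels are converted to degrees of visual angle for the display in question.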