Abstract
The statistical regularities of natural images allow a visual system to interpolate missing or occluded image points with some accuracy. In previous work¹, we obtained a tight (near-ideal) lower bound on how accurately missing pixel values in gray-scale natural images can be estimated. Here we measure how well the human visual system estimates gray-level at missing image locations and then compare human performance against the near-ideal observer and several other models that incorporate natural image statistics and/or known low-level properties of the human visual system. On each trial of the experiment, one of 62 gray-scale natural image patches was presented to the observer. Psychometric functions were measured by varying the gray-level of the central pixel and asking the observer to respond whether the central pixel appeared too bright or too dark (pixel size = 4 arcmin). No feedback was given. The observer's interpolated gray-level was taken to be the midpoint of the psychometric function. For each natural image, two patch sizes were tested: a large size (128x128 pixels) that provided the full spatial context, and a small size (5x5 pixels) that provided only the 24 neighboring pixels. Our results reveal that human performance is very similar in the 128x128 and 5x5 conditions, suggesting that human gray-level interpolation results predominantly from local image computations. Humans underperform the near-ideal observer but outperform simple models that compute the average, median, or mode of the surrounding pixels, suggesting that humans exploit some, but not all, of the local statistical structure of natural images. The best current model of human gray-scale interpolation performance is one that exploits the statistical structure of local contrast rather than local gray-level and incorporates the psychophysical effects of contrast masking.
¹Geisler & Perry (2011). Journal of Vision, 11(12):14, 1–17.
Meeting abstract presented at VSS 2012
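
The following sketch (in Python, for illustration only; it is not the authors' code) outlines the two local pieces of the method described above under stated assumptions: estimating the missing central pixel from the mean, median, or mode of the 24 pixels surrounding the center of a 5x5 patch, and taking an observer's interpolated gray-level as the midpoint of a cumulative-Gaussian psychometric function fit to "too bright" responses. The function names, the cumulative-Gaussian form, and all numerical values are assumptions, not details from the abstract.

# Illustrative sketch only -- not the authors' analysis code.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def baseline_estimates(patch_5x5):
    """Mean, median, and mode of the 24 pixels surrounding the center of a 5x5 patch."""
    patch = np.asarray(patch_5x5, dtype=float)
    mask = np.ones((5, 5), dtype=bool)
    mask[2, 2] = False                        # exclude the missing central pixel
    neighbors = patch[mask]                   # the 24 surrounding gray-levels
    vals, counts = np.unique(np.round(neighbors), return_counts=True)
    return {"mean": neighbors.mean(),
            "median": np.median(neighbors),
            "mode": vals[counts.argmax()]}

def psychometric_midpoint(test_levels, prop_too_bright):
    """Fit a cumulative Gaussian to 'too bright' proportions; return its midpoint."""
    def cum_gauss(x, mu, sigma):
        return norm.cdf(x, loc=mu, scale=sigma)
    (mu, sigma), _ = curve_fit(cum_gauss, test_levels, prop_too_bright,
                               p0=[np.mean(test_levels), np.std(test_levels)])
    return mu                                 # taken as the interpolated gray-level

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 256, size=(5, 5))           # stand-in for a natural-image patch
    print(baseline_estimates(patch))
    levels = np.linspace(80, 120, 9)                    # hypothetical test gray-levels
    p_bright = norm.cdf(levels, loc=101.0, scale=6.0)   # simulated response proportions
    print(psychometric_midpoint(levels, p_bright))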