Abstract
“Visual difference predictors” (VDPs) are models that attempt to predict the extent to which people judge two natural scene images to be different from each other. A VDP generally samples scenes with arrays of localized, spatial-frequency- and orientation-selective filters; where the filters detect significant contrast differences, a larger overall difference is assigned. Our VDP bases its predictions on differences within the luminance, red-green and blue-yellow channels.
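To illustrate the kind of computation such a model performs, the Python sketch below splits two images into opponent channels, measures local contrast in a small bank of band-pass filters, and pools the supra-threshold differences into a single score. The difference-of-Gaussians bands, the fixed threshold, and the simple summation rule are placeholders chosen for exposition; they are not the specific filters or pooling of the VDP described here, which also employs orientation-selective filters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def opponent_channels(rgb):
    """Split an RGB image (H x W x 3, floats in [0, 1]) into rough
    luminance, red-green and blue-yellow opponent channels."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    lum = (r + g + b) / 3.0      # luminance
    rg = r - g                   # red-green opponent
    by = b - (r + g) / 2.0       # blue-yellow opponent
    return [lum, rg, by]

def bandpass_pyramid(channel, sigmas=(1, 2, 4, 8)):
    """Difference-of-Gaussians bands as a stand-in for localized,
    spatial-frequency-selective filters (orientation selectivity omitted)."""
    bands = []
    for s in sigmas:
        fine = gaussian_filter(channel, s)
        coarse = gaussian_filter(channel, 2 * s)
        bands.append(fine - coarse)   # local contrast at this scale
    return bands

def vdp_difference(img_a, img_b, threshold=0.01):
    """Pool per-band contrast differences that exceed a visibility
    threshold into a single predicted difference score."""
    total = 0.0
    for ch_a, ch_b in zip(opponent_channels(img_a), opponent_channels(img_b)):
        for band_a, band_b in zip(bandpass_pyramid(ch_a), bandpass_pyramid(ch_b)):
            diff = np.abs(band_a - band_b)
            total += np.sum(diff[diff > threshold])  # only significant differences count
    return total
```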
Here we examine how well VDPs predict supra-threshold differences. Do observer ratings of some kinds of image differences systematically deviate from the model's predictions, perhaps when the differences are considered “unimportant” (e.g. cast shadows, texture element displacement)?
We devised a large set of stimuli from digitized color photographs of many classes of scene; the supra-threshold differences between pairs could be natural (e.g. an object moves) or computer-generated (e.g. blurred or desaturated). A subset originated from a single reference scene (e.g. cow grazing in field) which was then transformed in many ways (e.g. cow moves/changes size; background moves; lighting changes). Image pairs in this subset were chosen so that they were equally discriminable to the low-level model. The perceptual differences between image pairs were determined with subjective magnitude ratings.
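The step of equating pairs to the model could, for example, be implemented by retaining only those transformed images whose model score falls within a narrow band around a common target. The function below is a hypothetical sketch using the vdp_difference score defined above; the target value and tolerance are illustrative, and in practice the transformations themselves might instead be adjusted until the scores match.

```python
def select_equated_pairs(reference, transforms, target, tolerance=0.05):
    """Keep only transformed versions of a reference image whose
    model-predicted difference from the reference lies within a tolerance
    band around a common target, so the retained pairs are (approximately)
    equally discriminable to the low-level model."""
    selected = []
    for name, transformed in transforms.items():
        score = vdp_difference(reference, transformed)
        if abs(score - target) / target <= tolerance:
            selected.append((name, transformed, score))
    return selected
```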
The supra-threshold predictions of our low-level VDP model are good for most image pairs, but the model fails consistently for some kinds of image differences (e.g. differences in the periphery of the images). Instances where the VDP model over-predicts differences may reveal that the differences were available to low-level vision but were discarded somewhere between the low-level processing and the observer's response.