Abstract
The visual system faces the problem of extracting biologically relevant information from a large amount of input data. It was proposed long ago that complex scenes are initially summarized by extracting only a small number of meaningful features (a "primal sketch") (Barlow, 1959; Marr, 1976). The present study follows a pattern-filtering model based on the principle of efficient information coding under real-world limitations (Punzi & Del Viva, VSS 2006). From very general principles, the model predicts the structure of efficient information-coding patterns, which provide highly compressed "sketches" of visual scenes containing only a limited number of salient features (Del Viva & Punzi, VSS 2008). The aim of this study is to test directly whether the "salient features" extracted by this computational model correspond to the visual features that human subjects actually use to discriminate between natural images. We performed a psychophysical experiment in which hybrid images were briefly presented (20 ms); each hybrid was composed by mixing points sampled from the "salient features" of one image (according to our algorithm) with points extracted from another (background) image using a different algorithm. The subject was then presented with the two original images and asked to judge which of the two more closely resembled the hybrid, in a two-alternative forced-choice (2AFC) procedure. All subjects indicated with high probability (>80%) the image sampled according to the salient features predicted by our algorithm as the better match. This performance remained unchanged when a random, unrelated image was offered as a possible match in place of the image actually used as background, showing that matches to the background are mainly of random nature. These results support the reliability of our model in predicting the salient features that human subjects rely upon to identify visual scenes under very fast viewing conditions.
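
As an illustration of the hybrid-image construction described above, here is a minimal sketch, assuming each of the two sampling algorithms yields a boolean mask marking the pixels it selects; the function name, mask names, and the rule for resolving overlapping pixels are hypothetical assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def hybrid_image(img_a, salient_mask_a, img_b, sample_mask_b):
    """Illustrative sketch (hypothetical, not the authors' code):
    combine pixels sampled from image A's salient features with
    pixels sampled from the background image B into one hybrid.
    Masks are boolean arrays of the same shape as the images."""
    hybrid = np.zeros_like(img_a)
    # Copy the salient points predicted by the model from image A.
    hybrid[salient_mask_a] = img_a[salient_mask_a]
    # Fill in background points from image B, without overwriting A's points.
    bg = sample_mask_b & ~salient_mask_a
    hybrid[bg] = img_b[bg]
    return hybrid

# Example with random data: a pair of 64x64 grayscale images and sparse masks.
rng = np.random.default_rng(0)
a, b = rng.random((64, 64)), rng.random((64, 64))
mask_a, mask_b = rng.random((64, 64)) < 0.05, rng.random((64, 64)) < 0.05
h = hybrid_image(a, mask_a, b, mask_b)
```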