Abstract
It is widely acknowledged that the visual system summarizes complex scenes to extract meaningful features (see, for example, Barlow, 1959; Marr, 1976; 1982). This ‘primal sketch’ is based on primitives such as edges and bars, for which specific neural mechanisms have been found (Hubel & Wiesel, 1962; 1977; Maffei et al., 1979; Kulikowski & Bishop, 1983) and many computational models have been proposed (Marr & Hildreth, 1980; Watt & Morgan, 1985; Morrone & Burr, 1988). Several studies also suggest that neurons encode sensory input in an information-efficient way (Barlow, 1972; Atick, 1992), using a small number of active neurons at any given time (‘sparse coding’; Olshausen & Field, 1996; 2004). In this work we apply a novel pattern-recognition model, derived from a principle of maximally efficient information coding within given computational limitations (Punzi & Del Viva, VSS 2006). Using sets of natural and artificial images, we show that this model, despite having very few free parameters, processes images in a way strikingly similar to the human visual system: it identifies edges, lines, and textural elements, and it predicts a structure of visual filters closely resembling well-known receptive fields. To evaluate the biological plausibility of the approach, we compared the model's performance with that of human observers tested with psychophysical techniques. These results lead us to argue that real-world limitations on an information-processing system do much more than simply limit its performance: they act as a strong constraint defining what the system categorizes as relevant features in the input, that is, what it ultimately perceives as meaningful.