Central to the first approach was the idea that the spatial structure of receptive fields in the early visual system is well suited to representing information in a sparse neural code that exploits spatial regularities in the natural environment and reduces energy consumption (Lennie,
2003). Starting from signal processing considerations, several groups (Atick & Redlich,
1992; Field,
1987) found that receptive fields of simple and complex cells in primate V1 are well adapted to the approximate 1/f fall-off in the amplitude spectrum of natural scenes. However, later investigations showed that the 1/f amplitude fall-off in natural scenes cannot explain the development of localized receptive fields in the visual system (Bell & Sejnowski,
1997; Field,
1994; Olshausen & Field,
1996). Field (
1987,
1994) showed that nonlocalized receptive fields would be better suited to represent spatial information in a world with 1/f amplitude fall-off than localized receptive fields. This insight highlighted the importance of local phase alignments over multiple spatial scales of natural scenes for sparse coding. These phase alignments perceptually manifest as lines and edges in natural scenes and are thought to drive the development of localized Gabor-like receptive fields in models when a sparse coding constraint is imposed (Bell & Sejnowski,
1997; Field,
1994; Olshausen & Field,
1996). Field (
1994) argued that these nonrandom phase alignments give natural scenes higher spatial redundancy than noise images whose amplitude spectra have the same shape as those of the intact images. Therefore, our goal was to keep amplitude spectra perfectly matched across phase randomization levels. In our approach, we randomized Fourier phases in photographs of natural scenes independently over spatial scales and orientations. In addition, we used different degrees of randomization to parametrically transform the highly redundant spatial statistics of our natural scene photographs to the low redundancy level of pink noise images. The success of this manipulation is indicated by the decrease of pixel kurtosis with increasing degrees of phase randomization while RMS contrast remains at the same level. Field (
1994) showed theoretically, and multiple single-neuron recordings (Baddeley et al.,
1997; Felsen et al.,
2005; Vinje & Gallant,
2000,
2002; Weliky et al.,
2003; Willmore & Tolhurst,
2001) showed experimentally, that localized log Gabor-like receptive fields, similar to those of V1 neurons, translate high pixel redundancy in natural scenes into high redundancy in neuronal population activity. To provide further support for the biological plausibility of this theory, Olshausen and Field (
1996) derived a dictionary of basis functions that closely resembled the receptive fields of neurons in V1 to approximate natural scene photographs with a sparse code. To obtain these basis functions, the authors traded off the veridicality of natural scene representation against the sparseness of the neural code. In sum, these studies suggest that V1 neuronal population activation is expected to be sparse, with many weakly activated and few strongly activated neurons, when intact natural scene photographs are presented as visual stimuli. With increasing phase randomization in our images, however, the neuronal population code in V1 would become less sparse. This behavior is expected for sparse coding models based on log Gabor-like basis functions that are optimal for veridically representing natural scenes with a sparse population code. The implementations of sparse coding discussed so far do not impose constraints on the veridicality of the image representation once the basis functions are determined. The veridical sparse representation of natural scenes is considered energy efficient with such basis functions (Baddeley et al.,
1997; Graham & Field,
2009; Hyvärinen et al.,
2009; Perna, Tosetti, Montanaro, & Morrone,
2008; Rozell et al.,
2008; Vinje & Gallant,
2000,
2002; Willmore & Tolhurst,
2001), but because they are optimized to represent structure in natural scenes, they would produce a less energy-efficient population code when the goal is to veridically represent images lacking the required spatial structures. Of course, it is not possible to directly measure neural population sparseness and veridicality of the image representation with the noninvasive recording methods we used in our study. However, based on the theoretical considerations and experimental evidence discussed above, we consider it safe to suppose that manipulating pixel sparseness by increasing the phase randomization of our stimuli transformed the neuronal population code in V1 from sparse to distributed, as predicted by sparse coding theory. Importantly, this manipulation had no effect on the overall population activation level measured with the complementary imaging methods fMRI and MEG. This result implies that redundancy reduction in V1's neural population response does not reduce the metabolic demand of neural information coding. These are constraints that sparse coding models should meet to be biologically plausible. Our findings that the population response level is independent of Fourier phase but depends on Fourier amplitude support the hypothesis that, at the population level, neural responses in V1 can be described to a good approximation as a linear shift-invariant system (e.g., De Valois & De Valois,
1990; Movshon et al.,
1978). This conclusion is consistent with contrast energy theory and with theoretical work aiming to derive linear basis functions (receptive fields) for sparse coding of spatial information directly from natural scenes (Bell & Sejnowski,
1997; Field,
1987; Hancock, Baddeley, & Smith,
1992; Olshausen & Field,
1997; van Hateren & van der Schaaf,
1998).
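
The phase manipulation discussed above can be illustrated with a minimal sketch in Python with NumPy. It is not the procedure used in the study, whose exact randomization scheme is not specified in this section. The sketch assumes one common way of grading the randomization, namely adding a weighted random phase to every Fourier component, and it uses a small synthetic image of two bright bars in place of a natural scene photograph. The function names `phase_randomize`, `pixel_kurtosis`, and `self_conjugate_indices` and the parameter `weight` are hypothetical and chosen for illustration only.

```python
import numpy as np


def self_conjugate_indices(n):
    """Indices along one axis that equal their own negative frequency
    (DC, plus the Nyquist frequency when n is even)."""
    return [0] if n % 2 else [0, n // 2]


def phase_randomize(image, weight, rng):
    """Add a weighted random phase to every Fourier component.

    weight = 0 returns the original image; weight = 1 gives fully random
    phases. The amplitude spectrum is left untouched. This blending rule
    is an assumption made for illustration.
    """
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Phases of a real white-noise image are antisymmetric, so adding them
    # keeps the inverse transform real-valued.
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))

    # Self-conjugate components must stay real, so their phase is not changed.
    rows, cols = image.shape
    noise_phase[np.ix_(self_conjugate_indices(rows),
                       self_conjugate_indices(cols))] = 0.0

    mixed = amplitude * np.exp(1j * (phase + weight * noise_phase))
    return np.real(np.fft.ifft2(mixed))


def pixel_kurtosis(image):
    """Pearson kurtosis of the pixel values (a Gaussian has a value of 3)."""
    centered = image - image.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2


# Hypothetical test image: two bright bars on a dark background, which gives
# a sparse (high-kurtosis) pixel distribution with sharp, phase-aligned edges.
image = np.zeros((64, 64))
image[10:14, 12:30] = 1.0
image[40:42, 5:50] = 1.0

rng = np.random.default_rng(0)
for weight in (0.0, 0.5, 1.0):
    scrambled = phase_randomize(image, weight, rng)
    print(f"weight={weight:.1f}  "
          f"kurtosis={pixel_kurtosis(scrambled):.2f}  "
          f"rms_contrast={scrambled.std():.4f}")
```

Because only the phases change, the mean and the standard deviation of the pixel values (the RMS contrast) are preserved by Parseval's theorem, while the pixel kurtosis is expected to fall toward the Gaussian value as the edges dissolve. A response measure that depends only on the amplitude spectrum would therefore be identical for all three images in this sketch.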