Abstract
The objective of this study is to understand the relationship between the response properties of visual neurons and the statistical structure of natural time-varying images. It has been proposed that sensory systems such as the human visual system reduce the redundancy in their sensory input to produce statistically independent outputs (Attneave F 1954; Barlow HB 1961; Atick JJ 1992). Such a coding strategy gives the brain an evolutionary advantage. Removing second-order spatial and temporal correlations, under the constraint of minimum time delay, leads to filters that agree quantitatively with the receptive fields of retinal ganglion cells and LGN cells (Dong DW, Atick JJ 1995). Removing higher-order spatial correlations leads to the emergence of oriented spatial filters (Olshausen BA, Field DJ 1996; Bell AJ, Sejnowski TJ 1997), which are similar to the spatial receptive fields of simple cells. To understand the receptive fields of simple cells quantitatively, however, space and time must be taken into account simultaneously. A previous study demonstrated the emergence of spatiotemporal filters through independent component analysis of TV videos (van Hateren JH, Ruderman DL 1998), but those filters have markedly different characteristics from the spatiotemporal receptive fields of simple cells. Specifically, real biological neurons respond to visual inputs with much shorter time delays. It has been proposed that it is also vital, from an evolutionary point of view, for sensory systems to respond to sensory inputs as soon as possible (Dong DW, Atick JJ 1995). We believe that this is another important principle for the organization of sensory systems. Adding a cost function for time delay to the independent component algorithm forces the emergence of oriented spatiotemporal filters with minimum time delay, which are closer to the receptive fields of simple cells.
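
The two ingredients named above, removal of second-order correlations and a cost on time delay, can be illustrated with a minimal Python sketch. It is not the algorithm used in this study: the abstract does not specify the form of the delay cost, so the lag-weighted energy penalty in `delay_cost` is purely an assumption, as are the function names, the symmetric (ZCA) whitening choice, and the layout of a filter as a flattened space-by-time array. The whitening step assumes a full-rank covariance matrix.

```python
import numpy as np

def whiten(X):
    """Remove second-order correlations from X (features x samples).

    Uses symmetric (ZCA) whitening; assumes the covariance is full rank.
    Returns the whitened data and the whitening matrix.
    """
    Xc = X - X.mean(axis=1, keepdims=True)           # center each feature
    cov = Xc @ Xc.T / Xc.shape[1]                    # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # cov is symmetric
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return W @ Xc, W

def delay_cost(filters, n_space, n_time):
    """Hypothetical delay penalty: filter energy weighted by temporal lag.

    `filters` has shape (n_filters, n_space * n_time), with each row a
    flattened space-by-time filter; lag 0 is the most recent frame.
    """
    f = filters.reshape(-1, n_space, n_time)
    energy_per_lag = (f ** 2).sum(axis=1)            # (n_filters, n_time)
    lags = np.arange(n_time)
    return (energy_per_lag * lags).sum(axis=1)       # one cost per filter
```

In a sketch like this, a penalty such as `delay_cost` would be added, with some weight, to the independence objective optimized over the whitened data, so that filters concentrating their energy at short lags are favored.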