Research Article  |   January 2002
Receptive field structure of neurons in monkey primary visual cortex revealed by stimulation with natural image sequences
Journal of Vision January 2002, Vol.2, 2. doi:10.1167/2.1.2

      Dario L. Ringach, Michael J. Hawken, Robert Shapley; Receptive field structure of neurons in monkey primary visual cortex revealed by stimulation with natural image sequences. Journal of Vision 2002;2(1):2. doi: 10.1167/2.1.2.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Probing the visual system with the ensemble of signals that occur in the natural environment may reveal aspects of processing that are not evident in the neural responses to artificial stimulus sets, such as conventional bars and sinusoidal gratings. However, unsolved is the question of how to use complex natural stimulation, many aspects of which the experimenter cannot completely specify, to study neural processing. Here a method is presented to investigate the structure of a neuron’s receptive field based on its response to movie clips and other stimulus ensembles. As a particular case, the technique provides an estimate of the conventional first-order receptive field of a neuron, similar to what can be obtained with other reverse-correlation schemes. This is demonstrated experimentally and with computer simulations. Our analysis also revealed that the receptive fields of both simple and complex cells had regions where image boundaries, independent of their contrast sign, would enhance or suppress the cell’s response. In some cases, these signals were tuned for the orientation of the boundary. This demonstrates for the first time that it might be feasible to investigate the receptive field structure of visual neurons from their responses to natural image sequences.

Introduction
The goal of this project is to investigate how visual cortical cells respond to natural stimulation and to study what sort of signal processing occurs within primary visual cortex. We postulate that the use of natural image sequences may reveal aspects of cortical processing that are not evident when using simpler stimuli such as bars and luminance modulated gratings. Because the cortex is a nonlinear network, it may not be feasible to use the neural responses to simple stimuli to predict and understand cortical cells’ activity under natural stimulus conditions. Furthermore, accumulating evidence suggests that the surround of the “classical receptive field” of a neuron can modulate its response in very specific ways (see Sillito, Grieve, Jones, Cudeiro, & Davis, 1995; Zipser, Lamme, & Schiller, 1996; Levitt & Lund, 1997; Walker, Ohzawa, & Freeman, 1999; Sceniak, Ringach, Hawken, & Shapley, 1999; Walker, Ohzawa, & Freeman, 2000; Kapadia, Westheimer, & Gilbert, 2000). It has been proposed that contextual modulation is the basis for figure and ground segregation (Knierim & van Essen, 1992; Zipser et al., 1996; Sillito et al., 1995), as well as grouping and segmentation (Kapadia et al., 2000; Chen, Kasamatsu, Polat, & Norcia, 2001). It has also been argued that the pattern of contextual interactions observed in V1 is what one would expect for a grouping network processing natural scenes (Sigman, Cecchi, Gilbert, & Magnasco, 2001). Thus, it is becoming increasingly important to understand how V1 neurons respond when their “classical receptive fields” are embedded in a natural surround.
A first step necessary to attack these questions is to develop methods to study the structure of a neuron’s receptive field from its response to natural image sequences. Here we present a technique that allows one to investigate the input-output relationship of the cell by estimating a family of receptive-field “kernels” associated with particular “features” of the stimulus. As a particular case, the proposed method recovers the first-order kernel of a neuron with respect to the luminance of the visual stimulus (Marmarelis & Marmarelis, 1978).
The data set collected in the present study consists of several movie segments that have been digitized and stored in the computer (the stimulus) and the corresponding responses of neurons in V1 to these movie clips. Our goal is to understand how neural activity in each case is influenced by the physical properties of the image sequences. The complexity of a natural stimulus introduces several challenges in the analysis. Most aspects of the stimulus are no longer under experimental control. Instead of varying one parameter at a time, as is customary in most experimental designs, a large number of physical properties are changing simultaneously. Explaining the full response of the cell to the movie sequences might be a very difficult task; instead, it is shown that one can readily test if a particular property of the stimulus influences the firing rate of a cell and to what extent. In this way, the richness of natural scenes can be exploited to explore which “features” of the image induce a cell to fire. 
We begin by considering cortical simple cells (Hubel & Wiesel, 1968; Movshon, Thompson, & Tolhurst, 1978a; Skottun et al., 1991). Simple cells are thought to provide oriented, spatially bandpass filtering that is one of the essential early stages in visual processing (DeValois & DeValois, 1988). A common mathematical model describing the function of simple cells, under a constant level of contrast gain control, is a linear operator followed by rectification (Movshon et al., 1978a; Tolhurst & Dean, 1990; Tolhurst & Heeger, 1997; Carandini, Heeger, & Movshon, 1997). Numerous studies of V1 simple cells using bars, spots, or gratings have established that their receptive field structure can be approximated by spatially discrete antagonistic subregions (Hubel & Wiesel, 1968; Movshon et al., 1978; Andrews & Pollen, 1979; Jones & Palmer, 1987; Parker & Hawken, 1988). If spatial summation of neural signals by simple cells is linear, then different stimulus ensembles could be used to determine the first-order linear receptive field or kernel of these neurons (Victor, 1992). Furthermore, invariance of the resulting kernel with respect to the stimulus ensemble provides one way in which the linearity of simple cells could be tested (Ringach, Sapiro, & Shapley, 1997b). Using this method, we show that the first-order kernel of simple cells can readily be recovered from their responses to movie clips.
One must also consider that visual cortical cells could be responding to some nonlinear feature of the visual stimulus. Certainly complex cells respond in a nonlinear manner to luminance contrast (Hubel & Wiesel, 1968; Movshon et al., 1978b; Spitzer & Hochstein, 1985; DeValois et al., 1982; Szulborski & Palmer, 1990). To what other nonlinear features of the stimulus are V1 cells responding? The concept of a “feature map” of a stimulus is introduced in this paper as a way of studying responses of visual neurons to different attributes in natural image sequences. In essence, the method estimates the best linear predictor of the cell’s response given a particular feature map of the stimulus. The initial results indicate that V1 neurons have a variety of complex responses to natural images, and that sophisticated image processing might be occurring in the V1 cortex.
Methods
Physiology, Optics and Visual Stimulation
Acute experiments were performed on adult Old-World monkeys (Macaca fascicularis) in compliance with National Institutes of Health guidelines as described elsewhere (Ringach et al., 1997a). Natural image sequences were generated by digitally sampling commercially available videotapes in VHS/NTSC format. A Silicon Graphics R10000 Solid Impact was used to sample frames at a spatial resolution of 320 × 240 pixels (6 deg × 4.5 deg of visual angle) and at a temporal rate of 15 Hz. The selected movies included both man-made and natural landscape scenes. Six segments of 30-s duration were sampled from eight different movies, making a total of 24 minutes of video. The movies were compressed using Silicon Graphics’ MVC2 compression scheme (proprietary) and stored on a disk. The compressed data fitted in 480 megabytes of memory. A Silicon Graphics O2 R5000 computer played back the images during the experiment on a computer screen that measured 34.3 cm wide by 27.4 cm high. The refresh rate of the monitor was 100 Hz and each movie image was presented for six consecutive frames. Thus, the effective playback rate was 16.6 Hz, slightly faster than the sampling rate. The mean luminance of the display was 56 cd/m². Stimulation was monocular to the dominant eye (the other eye was occluded). Movie clips were effective in evoking responses from V1 cells; the mean spike rate to natural image stimulation was ≈10 spikes/s. We think these movie clips have statistics similar to those used in other studies of natural image sequences, such as those employed by van Hateren and van der Schaaf (1998), who sampled videos from Dutch, British, and German TV broadcasts.
Experimental Protocol
Each cell was stimulated monocularly via the dominant eye and characterized by measuring its steady-state response to conventional black/white drifting sinusoidal gratings (the nondominant eye was occluded). With this method we measured basic attributes of the cell, including spatial and temporal frequency tuning, orientation tuning, contrast, and color sensitivity (Johnson, Hawken, & Shapley, 2001), as well as area, length, and width tuning curves. Experiments using natural image sequences were performed following these standard measurements. Steady-state orientation tuning curves were obtained using angular steps of 15 deg or 20 deg. In a few very sharply tuned cells, we used steps of 10 deg. Simple cells are defined as those neurons whose responses had ratios of first harmonic to mean response larger than one when stimulated with a drifting grating having optimal spatio-temporal parameters. All other cells are defined as being complex. Receptive fields were located at eccentricities between 1 and 6 deg.
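The simple/complex criterion above (first harmonic over mean response larger than one) can be illustrated with a short sketch. The function names `f1_f0_ratio` and `classify`, the sampling of the firing rate over an integer number of grating cycles, and the discrete Fourier estimate of the first harmonic are illustrative assumptions, not a description of the analysis software used in the study.

```python
import cmath
import math

def f1_f0_ratio(rate, cycles):
    """Ratio of first-harmonic amplitude (F1) to mean rate (F0).

    rate   -- firing-rate samples spanning exactly `cycles` grating cycles
    cycles -- integer number of stimulus cycles covered by `rate`
    """
    n = len(rate)
    f0 = sum(rate) / n
    # Amplitude of the Fourier component at the stimulus temporal frequency.
    comp = sum(r * cmath.exp(-2j * math.pi * cycles * k / n)
               for k, r in enumerate(rate))
    f1 = 2.0 * abs(comp) / n
    return f1 / f0

def classify(rate, cycles):
    """Label a cell 'simple' if F1/F0 > 1, otherwise 'complex'."""
    return "simple" if f1_f0_ratio(rate, cycles) > 1.0 else "complex"

# Toy responses: a half-wave rectified sinusoid and a weakly modulated rate.
n, cycles = 200, 4
phase = [2 * math.pi * cycles * k / n for k in range(n)]
rectified = [max(math.sin(p), 0.0) for p in phase]
modulated = [10.0 + 2.0 * math.sin(p) for p in phase]
print(classify(rectified, cycles), classify(modulated, cycles))
```

A half-wave rectified sinusoid has F1/F0 close to π/2, so it falls on the simple side of the criterion, whereas a response that is only weakly modulated around a high mean falls on the complex side.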
Analysis
The question that we want to address is how the response of a neuron depends on the recent history of the movie sequence. We propose a method that is an extension of one recently introduced by DiCarlo, Johnson, and Hsiao (1998) to analyze receptive fields in area 3b of primary somatosensory cortex in response to random dot patterns (see also the recent work of Theunissen et al. [2001]). The following terminology will be used. Let I(x,y,t) denote the value of a pixel at location (x,y) and time t. This is normally a three-dimensional vector representing the values of the red, green and blue components of the pixel. For the response of the cell we consider the total number of spikes occurring within a time window Δt centered at time t. This value is denoted by r(t). 
Formally, the general problem is to determine how the response, r(t), depends on the recent history of the visual stimulus, s(t) = {I(x,y,t′) | t − T < t′ < t}, where T is the width of the analysis window. This relationship is fully characterized by the joint probability of the stimulus and the response P(s,r) (Rieke, Warland, van Steveninck, & Bialek, 1997). Due to the high dimensionality of the stimulus space, however, estimating this probability distribution is not possible in the given experimental time. Instead, methods that make specific assumptions about the relationship between stimulus and response are required. Here we consider a general class of models described by

$$r(t) = \bar{r} + \sum_{x,y} \sum_{t-T < t' < t} w(x,y,t-t')\,\Phi(x,y,t') \tag{1}$$

where $\bar{r}$ is the mean response rate, Φ(x,y,t) represents a feature map sequence, and w(x,y,t) are weights representing a spatio-temporal kernel of the receptive field. The feature map is a function (linear or nonlinear) of the input image sequence, I(x,y,t). Therefore, the cell’s modulation away from its mean response $\bar{r}$ is modeled as a linear spatio-temporal filter acting on the feature map sequence. The choice of Φ is limited only by our intuition about what “features” of the image sequence the cell at hand may be representing.
For example, one of the feature maps considered below is the “luminance contrast map,” defined by

$$\Phi_L(x,y,t) = \frac{L(x,y,t) - \bar{L}(t)}{\bar{L}(t)}$$

where $L(x,y,t) = w^{T} I(x,y,t)$ is the luminance of the pixel at location (x,y) at time t, and $\bar{L}(t)$ is the mean luminance of the frame at time t. The luminance of a pixel is obtained by weighting the values of the red, green, and blue guns appropriately (which is achieved by multiplying with a vector w obtained from the calibration of the display; this vector is distinct from the kernel weights w(x,y,t) of Equation (1)). As an example of this calculation, we present original color frames from the stimulus in Figure 1a, and their associated luminance contrast maps in Figure 1b. With this definition of the feature map, the modulation of the cell’s response is modeled as a linear function of the luminance contrast values within its receptive field, a commonly used model for simple cells.
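As a minimal sketch of the luminance contrast computation for a single frame, the following assumes that contrast is the deviation of each pixel's luminance from the frame mean, divided by the frame mean; that normalisation, the function name `luminance_contrast_map`, and the example gun weights are assumptions made for illustration (in the experiment the weights come from the display calibration).

```python
def luminance_contrast_map(frame_rgb, gun_weights):
    """Luminance contrast of one frame.

    frame_rgb   -- list of rows, each row a list of (r, g, b) tuples
    gun_weights -- (wr, wg, wb); hypothetical stand-in for the calibration vector
    """
    # Luminance of each pixel: weighted sum of the three gun values.
    lum = [[sum(w * c for w, c in zip(gun_weights, px)) for px in row]
           for row in frame_rgb]
    n_pixels = sum(len(row) for row in lum)
    mean_lum = sum(sum(row) for row in lum) / n_pixels
    # Assumed normalisation: deviation from the frame mean over the frame mean.
    return [[(v - mean_lum) / mean_lum for v in row] for row in lum]

# Two-pixel toy frame: one dark pixel, one bright pixel.
frame = [[(1.0, 1.0, 1.0), (3.0, 3.0, 3.0)]]
print(luminance_contrast_map(frame, (0.25, 0.5, 0.25)))
```

Pixels darker than the frame mean receive negative values and brighter pixels receive positive values, matching the black/white convention of Figure 1b.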
Figure 1
 
Examples of feature maps obtained from the original frames in the movie. (a) Three still frames taken from the movie “Sleeper.” (b) The luminance-contrast maps associated with the original images. Regions in white indicate positive values of contrast, whereas regions in black indicate negative values. (c) The edge map associated with the original images. Locations where large gradients in the luminance contrast map are located are emphasized. (d) Oriented edge maps associated with the original images when θ = 0; this choice accentuates oriented boundaries that are near vertical. (e) Oriented edge maps associated with the original images when θ = π/2; this emphasizes oriented edges that are near horizontal.
A second feature map of interest is given by

$$\Phi_E(x,y,t) = \lVert \nabla \Phi_L(x,y,t) \rVert \tag{2}$$

where $\Phi_L$ denotes the luminance contrast map and $\nabla = (\partial/\partial x, \partial/\partial y)$ is the spatial gradient. This value represents the absolute value of the luminance contrast gradient, which is large in those regions where boundaries are present in the image. Thus, the result of this computation may be considered an “edge map” (Pratt, 1991). It can be seen that the edge map (Figure 1c) associated with the original images (Figure 1a) emphasizes local changes in contrast. This definition of the “edge map” is insensitive to the local contrast sign of the contour or its orientation. Clearly, the edge map is a nonlinear operator on the luminance of the images.
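In some situations it is of interest to separate the contributions of edges at different orientations to the cell’s response. This may be done by defining an oriented edge map as follows:

$$\Phi_\theta(x,y,t) = \lvert \nabla \Phi_L(x,y,t) \cdot u_\theta \rvert, \qquad u_\theta = (\cos\theta, \sin\theta) \tag{3}$$

where $\Phi_L$ denotes the luminance contrast map and $\nabla$ is the spatial gradient.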
This feature map emphasizes edge boundaries whose orientations are normal to the selected orientation. For example, the selection θ = 0 accentuates vertical boundaries in the image (Figure 1d). Similarly, the oriented edge map where θ = π/2 emphasizes horizontal boundaries (Figure 1e). This measure is also insensitive to contrast sign. 
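The edge map and the oriented edge map can be sketched as follows. The central-difference gradient, the treatment of image borders, and the function names `gradient`, `edge_map`, and `oriented_edge_map` are assumptions chosen for illustration; the text does not specify the discrete gradient operator that was used. The oriented map is taken here as the absolute value of the gradient projected onto the unit vector (cos θ, sin θ), which reproduces the behaviour described above for θ = 0 and θ = π/2.

```python
import math

def gradient(contrast):
    """Spatial gradient (gx, gy) of a 2-D map given as a list of rows.

    Central differences in the interior, one-sided differences at the borders.
    """
    h, w = len(contrast), len(contrast[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            if x1 > x0:
                gx[y][x] = (contrast[y][x1] - contrast[y][x0]) / (x1 - x0)
            if y1 > y0:
                gy[y][x] = (contrast[y1][x] - contrast[y0][x]) / (y1 - y0)
    return gx, gy

def edge_map(contrast):
    """Gradient magnitude: insensitive to contrast sign and to orientation."""
    gx, gy = gradient(contrast)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]

def oriented_edge_map(contrast, theta):
    """|gradient . (cos theta, sin theta)|: insensitive to contrast sign only."""
    gx, gy = gradient(contrast)
    c, s = math.cos(theta), math.sin(theta)
    return [[abs(a * c + b * s) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]

# A vertical boundary: dark on the left, bright on the right.
step = [[-1.0, -1.0, 1.0, 1.0] for _ in range(3)]
print(oriented_edge_map(step, 0.0)[1])          # responds to the vertical edge
print(oriented_edge_map(step, math.pi / 2)[1])  # essentially zero everywhere
```

Reversing the sign of `step` leaves both maps unchanged, which is the contrast-sign invariance noted in the text.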
Once a feature map Φ(x,y,t) is selected, we want to find the optimal spatio-temporal weighting function, or kernel, w(x,y,t) that predicts the response of the cell in the least squares sense according to the model in Equation (1). When the input is a white-noise stimulus, one computes this kernel by calculating the mean input before a spike (Lee & Schetzen, 1965; deBoer & Kuyper, 1968). This computation does not apply to natural images because there are strong spatio-temporal correlations in the input sequence and resulting feature maps. The autocorrelation of the input must be taken into account. To do this, we used a standard technique, recursive least-squares (RLS), to calculate the optimal kernel. The input to the algorithm is the feature map sequence and the response of the cell. The output is the optimal kernel that transforms the feature map into the response. This is done via a recursive procedure that refines our guess of the kernel as more and more data are added to the calculation. At the first step of the calculation, the weights (kernel values) are all set to zero. At the nth step of the calculation, the old estimate of the kernel, at step n − 1, is used to predict the response of the cell. The error between the predicted and true response is used to make a correction to the weighting function and generate a new estimate. The correction in the RLS algorithm is computationally complex, but in essence it is the present input image, filtered so as to correct for image correlation and weighted by the magnitude of the error. It can be shown that the expected value of the RLS algorithm’s estimate is equal to the true value of the kernel (Haykin, 1991).
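The recursion described above can be sketched in a few lines. This is the textbook RLS update (gain vector, a priori prediction error, rank-one update of the inverse-correlation estimate) and is not necessarily the exact variant used in the study. The function name `rls_fit`, the initialisation constant `delta`, the `forgetting` factor, and the toy three-dimensional correlated input (standing in for a flattened feature-map patch with the mean response already removed) are all illustrative assumptions.

```python
import random

def rls_fit(inputs, responses, delta=1e-3, forgetting=1.0):
    """Recursive least-squares estimate of a linear kernel.

    inputs    -- list of input vectors (e.g. flattened feature-map patches)
    responses -- list of scalar responses, one per input vector
    delta     -- small constant; the inverse-correlation estimate starts at I/delta
    """
    n = len(inputs[0])
    w = [0.0] * n  # kernel estimate starts at zero, as in the text
    P = [[(1.0 / delta if i == j else 0.0) for j in range(n)] for i in range(n)]
    for u, d in zip(inputs, responses):
        # Input filtered by the current inverse-correlation estimate.
        Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
        denom = forgetting + sum(u[i] * Pu[i] for i in range(n))
        gain = [v / denom for v in Pu]
        # Error between the true response and the prediction of the old kernel.
        err = d - sum(w[i] * u[i] for i in range(n))
        w = [w[i] + gain[i] * err for i in range(n)]
        # Rank-one update: no explicit matrix inversion is needed.
        P = [[(P[i][j] - gain[i] * Pu[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return w

# Toy check: correlated inputs, noise-free linear responses.
rng = random.Random(0)
true_w = [0.5, -1.0, 0.25]
inputs = []
for _ in range(300):
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    inputs.append([g[0], 0.8 * g[0] + 0.6 * g[1], 0.8 * g[1] + 0.6 * g[2]])
responses = [sum(a * b for a, b in zip(true_w, u)) for u in inputs]
print(rls_fit(inputs, responses))
```

Because the gain vector is the input multiplied by the running inverse-correlation estimate, the correction is decorrelated automatically, which is what distinguishes this procedure from simple spike-triggered averaging.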
Some of the advantages of the recursive least squares technique, over standard least squares, are as follows. First, in contrast to the standard least-squares technique, there is no need to invert the (very large) correlation matrix of the input data at any stage in the algorithm. Instead, a recursive estimate of the inverse of the correlation matrix is updated as new data arrive (Haykin, 1991). This is important when the condition number of the correlation matrix is high, as is the case for the application at hand. A high condition number implies that inverting the matrix is not a numerically stable process (Golub & van Loan, 1989). Second, the technique is recursive, so estimates can be updated as new data are collected. This could help us decide when sufficient data have been gathered on a particular cell as we run the experiment. Third, slow trends in excitability, due to variations in anesthetic levels and the physiology of the animal can be factored out by a recursive estimation of the mean values. Fourth, such a technique could be used in principle to follow changes of the receptive field with time when the cell is presented with nonstationary input. Thus, in principle, the technique could allow the study of adaptation to changes in the statistics of natural images. A detailed description of the algorithm is provided in the “1.” 
Results
To test the performance of the algorithm, we first determined whether the method could recover the classical first-order kernel in a model V1 cell consisting of a cascade of a linear receptive field (acting on the luminance contrast of the input) and a threshold nonlinearity (see Figure 2). 
Figure 2
 
Performance of method on a simulated simple cell. The system consists of a cascade of a spatial linear filter (acting on the luminance contrast of the input) followed by a hard-step threshold nonlinearity. The signal z(n) represents Gaussian additive noise. The simulated receptive field had two subfields, one excitatory (in red) and one inhibitory (in blue). The algorithm had to estimate this receptive field given the input image sequence and the response of the cell, r(n). The result of applying the method is shown below the image of the simulated receptive field.
At each time step, the dot product between the simulated receptive field and the input image was computed first (the receptive field was centered on the movie frame). This value is denoted by y(n) (Figure 2). Next, in an attempt to make the simulation realistic, y(n) was perturbed by a large amount of additive Gaussian noise, z(n). The standard deviations of y(n) and z(n) were equal, i.e., the signal-to-noise ratio was one. Finally, the resulting signal w(n) = y(n) + z(n) was passed through a hard rectifier (Figure 2, right). The threshold was set at a value that caused the model cell to “fire” (i.e., generate a nonzero output) only 12% of the time. This is equivalent to a mean response rate of ≈ 2 spikes/s. The output variance was 2.1 (spikes/s)². These numbers are close to the median values for our data: median response 2.4 spikes/s and variance (2.3 spikes/s)². The movies used in the simulation, and the length of the data record, were identical to those in the actual experiment. The simulated receptive field had two symmetric subfields, one excitatory (indicated in red) and one inhibitory (indicated in blue), and was defined on a square grid of 17 × 17 pixels representing 0.65 deg × 0.65 deg of visual angle. These parameters were selected to test the proposed method under stringent conditions: the algorithm had to estimate 289 parameters from very noisy thresholded data in the presence of highly correlated input signals (the condition number [Golub & van Loan, 1989] of the luminance covariance matrix was ≈ 3 × 10³). The resulting estimate of the receptive field is very good (Figure 2, lower receptive field): the correlation coefficient between the true and estimated weights equals 0.88. Thus, the algorithm can perform very well even in the presence of strong output nonlinearity and large additive noise levels.
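The structure of the model cell can be sketched as below. Several details are assumptions made for illustration only: random Gaussian vectors replace the movie frames, the receptive field has four pixels rather than 289, the rectifier is taken to be half-wave rectification above the threshold, and the threshold is chosen as an empirical quantile so that the cell fires about 12% of the time. The function name `simulate_model_cell` is hypothetical.

```python
import random
import statistics

def simulate_model_cell(frames, rf, fire_fraction=0.12, seed=0):
    """Linear filter, additive Gaussian noise (SNR = 1), thresholded output.

    frames -- list of flattened input patches (same length as rf)
    rf     -- flattened receptive-field weights
    """
    rng = random.Random(seed)
    # y(n): dot product between the receptive field and the input patch.
    y = [sum(a * b for a, b in zip(rf, f)) for f in frames]
    sd_y = statistics.pstdev(y)
    # w(n) = y(n) + z(n), with z(n) having the same standard deviation as y(n).
    v = [yi + rng.gauss(0.0, sd_y) for yi in y]
    # Threshold at the empirical quantile leaving `fire_fraction` of samples above.
    idx = min(int(round((1.0 - fire_fraction) * len(v))), len(v) - 1)
    threshold = sorted(v)[idx]
    # Assumed rectifier: zero below threshold, linear above it.
    return [max(vi - threshold, 0.0) for vi in v]

rng = random.Random(42)
frames = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(1000)]
rf = [1.0, 1.0, -1.0, -1.0]  # one excitatory and one inhibitory subfield
r = simulate_model_cell(frames, rf)
print(sum(1 for x in r if x > 0) / len(r))  # fraction of time steps with output
```

Feeding the resulting responses and inputs to a least-squares estimator then tests whether the weights can be recovered despite the noise and the threshold, which is the purpose of the simulation described in the text.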
The analysis was then applied to study the structure of receptive fields in 22 cells of macaque V1. As described in “Methods,” this was done by having the model predict the response of the cell at time t + τ given the feature map at time t. A fixed delay of τ = 60 ms, which corresponds to the median time-to-peak in our V1 population (Ringach et al., 1997a), was used for all cells. Representative results are shown in Figure 3. Each panel in the figure corresponds to a different cell and depicts the luminance contrast kernel on the left, and edge kernel on the right. Regions in red correspond to positive values of the kernel; those in blue represent negative values. For comparison, the optimal stimulus orientation obtained with drifting sinusoidal gratings is shown by the orientation of the bar on top of the kernels for each cell.
To check even more rigorously whether or not the kernels recovered with the RLS algorithm characterize the visual function of the V1 neurons, we mapped the luminance-contrast kernel in a few V1 simple cells using natural images and a more conventional reverse correlation technique (Ringach et al., 1997b). Figure 4a and 4b illustrate the results in two V1 neurons. The receptive field on the left panel corresponds to the estimate obtained using standard reverse correlation, and the panel on the right shows the kernel estimated from stimulation with natural image sequences. Both methods provide similar estimates. In addition, the luminance contrast kernel of V1 cells obtained from their responses to the movie clips often has elongated excitatory and inhibitory subfields (Figure 3). We compared the axis of elongation in the kernels with the preferred orientation of the cell estimated from the response tuning as a function of orientation for drifting sinusoidal gratings (Figure 4c). The axis of elongation of the strongest subfield was determined by calculating the eigenvalues and eigenvectors of the (centered) second order moment matrix of absolute values of the kernel for that subfield. The eigenvector associated with the largest eigenvalue provides the axis of elongation, and the ratio between the largest and the smallest eigenvalue gives the aspect ratio of the subfield. Figure 4c shows that the axis of elongation in the kernels matches the preferred orientation estimated from the steady-state orientation-tuning curve.
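The elongation measure can be made concrete with a small sketch. It assumes that the subfield has already been isolated as a 2-D array of kernel values, uses pixel indices as coordinates (so y increases downward, as in image coordinates), and uses the closed-form eigen-decomposition of a 2 × 2 symmetric matrix; the function name `elongation` is hypothetical.

```python
import math

def elongation(subfield):
    """Axis of elongation (radians) and aspect ratio of a kernel subfield.

    Uses the centered second-order moment matrix of the absolute kernel values.
    """
    pts = [(x, y, abs(v))
           for y, row in enumerate(subfield) for x, v in enumerate(row)]
    total = sum(m for _, _, m in pts)
    cx = sum(x * m for x, _, m in pts) / total
    cy = sum(y * m for _, y, m in pts) / total
    sxx = sum(m * (x - cx) ** 2 for x, _, m in pts) / total
    syy = sum(m * (y - cy) ** 2 for _, y, m in pts) / total
    sxy = sum(m * (x - cx) * (y - cy) for x, y, m in pts) / total
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] in closed form.
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam_max, lam_min = half_trace + root, half_trace - root
    # Direction of the eigenvector belonging to the largest eigenvalue.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    aspect = lam_max / lam_min if lam_min > 0 else math.inf
    return angle, aspect

# A subfield that is 5 pixels wide and 3 pixels tall (negative weights).
bar = [[-1.0] * 5 for _ in range(3)]
print(elongation(bar))  # axis along x, eigenvalue ratio 3
```

Because absolute values are used, an inhibitory subfield yields the same axis and aspect ratio as an excitatory subfield of the same shape.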
Figure 3
 
Analysis of receptive field structure using natural image stimulation. Each panel in this figure shows the estimated luminance contrast kernel (on the left) and the edge kernel (on the right) for several V1 cells. Additional information is displayed on top of each panel: the cell’s laminar location, the ratio between the first-harmonic component and the mean of the response (F1/F0) for the optimal sinusoidal grating stimulus followed by the classification of the cell as simple (F1/F0 > 1) or complex (F1/F0 ≤ 1), the angular size represented by one side of the 17 × 17 grid, and the preferred orientation of the cell as measured with conventional drifting gratings. The orientation of the bar corresponds to the orientation of the grating that generated the best response. In (l) the bar was omitted because the cell was not well tuned. In most cases we observe that the preferred orientation of the cell closely matches the axis of elongation of the estimated receptive fields. Each kernel was normalized independently so that its maximum absolute value was one. This makes optimal use of the pseudo-color map, which ranges from −1 (blue) to +1 (red).
Figure 4
 
(a,b) Comparison of receptive fields mapped with natural image sequences (right panel) and with subspace reverse correlation (left panel) for two V1 cells. (c) Scatter plot of the cell preferred orientation as measured with steady-state drifting gratings (x-axis) versus the angle of elongation of the strongest subfield in the kernels for the cells in Figure 3 (a–k). Open squares represent the elongation of subfields in the luminance contrast kernel and open circles represent the elongation for the edge map kernels. Cases where the ‘aspect ratio’ of the subfield defined by the ratio between the largest and smallest eigenvalues of the (centered) second order moment matrix was less than 1.2 were ignored. A small aspect ratio could result because of noise in the kernels (such as Figure 3i, left panel) or because the subfield was round (such as Figure 3l, right panel).
The statistical significance of the kernels was evaluated as follows. First, to obtain an estimate of the noise in the measurement we calculated the standard deviation of the kernel values in pixels located away from the receptive field. Then, the kernel was normalized by the standard deviation of the noise. The result of this calculation is a z-transformed kernel (Zar, 1996). Figure 5 replots the z-transformed values of some of the kernels in Figure 3. Here the color map ranges from a z value of −10 (blue) to +10 (red). Thus, the maximum value in this scale corresponds to a kernel amplitude that is 10 times the standard deviation of the noise. All kernel features discussed below, both in the luminance and edge maps, had peak absolute z values larger than 4. This implies a significance level of p < 7 × 10⁻⁵.
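The normalisation and the quoted significance level can be checked with a short sketch. It assumes that the noise pixels are supplied by the caller, that the population standard deviation is used, and that the p value is the two-sided tail of a standard normal distribution (an assumption that is consistent with the bound quoted above); the function names `z_transform` and `two_sided_p` are hypothetical.

```python
import math
import statistics

def z_transform(kernel, noise_pixels):
    """Divide kernel values by the standard deviation of the noise pixels."""
    sd = statistics.pstdev(noise_pixels)
    return [[v / sd for v in row] for row in kernel]

def two_sided_p(z):
    """Two-sided tail probability of a standard normal variable at |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))

kernel = [[2.0, -4.0]]
noise = [1.0, -1.0, 1.0, -1.0]     # kernel values far from the receptive field
print(z_transform(kernel, noise))  # noise SD is 1, so values are unchanged
print(two_sided_p(4.0))            # below 7e-5
```

Under these assumptions a peak absolute z value of 4 corresponds to a per-pixel p value slightly above 6 × 10⁻⁵, in line with the stated bound.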
Figure 5
 
Evaluating the statistical significance of kernel features. The figure shows the z-transformed values of some of the kernels depicted in Figure 3. All features described in the text had peak absolute z values larger than 4.
In the kernels mapped using natural image sequences, we observed simple cells that had structure in both the luminance contrast map and in the edge kernel map (Figure 3a,3c,3e, and 3f). Similarly, some complex cells also exhibited structure in their luminance contrast kernels (Figure 3b,3d,3g, and 3h, left panel), whereas others did not (Figure 3i–3l, left panel). Another salient feature of the data is that all cells, both simple and complex, showed spatial structure in their edge kernels. The structure of the kernels for a direction selective simple cell in layer 6 illustrates how this method of analysis can reveal complexities in the organization of a receptive field (Figure 3a). It can be seen that the luminance contrast kernel shows two elongated subfields; one excitatory and one inhibitory (Figure 3a, left panel). The preferred orientation of the cell, as measured with drifting sinusoidal gratings, is shown at the top right of Figure 3a and closely matches the luminance-contrast kernel’s orientation. 
The spatial structure observed in the kernel associated with the edge map was unexpected (Figure 3a, right panel). This kernel displays primarily two slightly elongated subfields. One field is excitatory, indicating that high values of luminance contrast gradients in that region induced the cell to respond more. The second subfield is inhibitory; it indicates that image boundaries in that region, independent of their contrast sign, suppressed the response of the cell. The cell’s preferred direction of motion, as determined with drifting gratings, was from the excitatory toward the inhibitory subfield. Notice also that the edge kernel appears to be slightly displaced with respect to the center of the luminance-contrast kernel and has a somewhat greater spatial extent. A weaker suppressive region, located to the left of the excitatory region, may also be seen. The spatial structure seen in the edge kernel reveals the presence of a contrast independent (nonlinear) signal that modulates the response of this simple cell. 
The result obtained in a complex cell from layer 4C is shown in Figure 3b. A luminance contrast kernel with two parallel subfields of opposite sign was detected (Figure 3b, left panel). The preferred orientation of the cell as determined with drifting sinusoidal gratings matches the axis of elongation of the subfields. The edge kernel has a single excitatory subfield centered at the same location as the excitatory subfield of the luminance-contrast kernel but extending further in space (Figure 3b, right panel). We observed several cases in which cells (both simple and complex) showed two subfields of opposite signs in the luminance-contrast kernel and a single excitatory subfield in the edge kernel (Figure 3c,3e, and 3f). 
In some cases, it appears that selectivity for orientation is conferred to the neuron by a contrast-insensitive signal. The luminance contrast kernel obtained from an “on”-center cell in layer 6 appears isotropic in space (Figure 3g, left panel). The edge kernel, on the other hand, is slightly elongated (Figure 3g, right panel). The axis of elongation corresponds well with the preferred orientation as measured with drifting gratings. The neuron was well tuned for orientation; its tuning curve had a half-bandwidth at half-height of 25 deg. One would not expect this cell to be orientation tuned based on the measurement of the luminance-contrast kernel alone. 
In Figure 3h we present the result from an “off-center” cell in layer 4B. Notice that the excitatory field in the edge kernel is slightly displaced in space with respect to the luminance contrast kernel. 
The structure of the edge kernel reveals information about the organization of the receptive field in complex cells that do not have measurable luminance contrast kernels. A subset of complex cells showed weak or no spatial structure in their luminance contrast kernels (Figure 3i–3l). The kernels estimated with respect to the edge map, on the other hand, have obvious spatial structure in them. Figure 3i illustrates the analysis of a direction selective complex cell in layer 6. The preferred direction of movement, as determined with drifting gratings, was from the excitatory towards the smaller inhibitory subfield (Figure 3i, right panel). Other cells in this group had a single excitatory field in their edge kernels. In some cases, the field was elongated and matched the preferred orientation of the cell (Figure 3j and 3k, right panel); other cells had nearly circular fields (Figure 3l, right panel). 
Finally, the effect of oriented image boundaries on the response of the cell can be studied by estimating spatial kernels with respect to the “oriented edge map.” Figure 6 shows examples from three complex cells. In each case four different kernels are depicted. In left to right order they are the luminance contrast kernel, the edge kernel, the oriented-edge kernel when the angle θ was selected to emphasize edges with orientations similar to the preferred orientation of the cell, and the oriented-edge kernel when the angle θ was selected to accentuate boundaries at the orthogonal orientation. 
Figure 6
 
The use of oriented edge maps to analyze the contribution of different orientations to the cell’s response. Each panel in this figure shows four kernels. In left to right order they represent: the luminance contrast kernel, the edge kernel, the oriented-edge kernel when the angle θ was selected to emphasize edges with orientations similar to the preferred orientation of the cell, and the oriented-edge kernel when the angle θ was selected to accentuate boundaries at the orthogonal orientation. These are all complex cells, and have weak structure in their luminance contrast kernels.
The edge kernel in Figure 6a indicates the presence of three subfields: a small central excitatory subfield flanked by two elongated inhibitory subfields. The kernels estimated with respect to the oriented edge maps show that the central excitatory mechanism arises from boundaries having the same orientation as the one preferred by the cell. The inhibitory subfields, in contrast, result from boundaries orthogonal to the preferred orientation. This means that in these regions of space, edges perpendicular to the optimal orientation for the cell suppress its response. This could be a manifestation of cross-orientation inhibition (Morrone, Burr, & Maffei, 1982). Thus, such analysis of responses to natural images may provide a way to understand how the spatial arrangements of oriented edge segments influence the response of the cell. 
In other complex cells, the kernels calculated with respect to the edge map and the oriented-edge map for the preferred orientation are similar and show a single excitatory region (Figure 6b and 6c). In contrast, the orthogonal edge maps appear to be more diffuse and peak in different spatial locations. One may conjecture that such receptive field structures may underlie the enhanced responses of cells to orientation contrast (Knierim & van Essen, 1992; Sillito et al., 1995). 
Discussion
The experimental results indicate that natural images can be used successfully to probe the visual properties of neurons. This method proved to be successful in obtaining the two-dimensional first-order kernel, or luminance-contrast feature map, in all simple cells (Figure 3). The orientation of the luminance feature map was consistent with the orientation tuning measured with grating stimuli (Figure 4c), cells showed parallel antagonistic subregions, and different spatial scales were evident among the population. In a couple of simple cells we compared the kernels obtained via stimulation with natural image sequences with those measured using subspace reverse correlation (Ringach et al., 1997b) and the results were similar. These findings, together with the simulation results, suggest that the proposed method works as expected. 
Some, but not all, complex cells showed luminance-contrast kernels, indicating that such neurons did receive excitatory input from “first-order” neural mechanisms. Thus, the analysis of the responses to the movie sequences verifies that they can be used to give us a two-dimensional spatial map of the receptive field. Of course, the results shown here are only a snapshot at a single time frame. The method can be extended to provide the full spatio-temporal kernel by having the input to the algorithm represent a recent spatio-temporal volume of the feature map sequence. This would require estimating more parameters and, as a consequence, more data to obtain a reliable answer. 
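The spatio-temporal extension mentioned above can be illustrated with a minimal sketch. It assumes each frame of the feature map has already been flattened to a list, and it stacks the most recent frames into one longer input vector. The function name `lagged_inputs` and the parameter `n_lags` are hypothetical and do not come from the study.

```python
from collections import deque

GRID = 17 * 17  # 289 spatial samples per frame, as in Appendix A


def lagged_inputs(frames, n_lags):
    """Yield (n, u), where u concatenates the n_lags most recent frames.

    frames: iterable of flat lists, one per stimulus frame.
    n_lags: hypothetical number of frames in the spatio-temporal volume.
    """
    window = deque(maxlen=n_lags)
    for n, frame in enumerate(frames):
        window.append(list(frame))
        if len(window) == n_lags:
            # Most recent frame first, then progressively older frames.
            u = [x for past in reversed(window) for x in past]
            yield n, u


# Toy usage with 3-sample "frames" instead of 289-sample ones.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
stacked = list(lagged_inputs(toy, n_lags=2))

# The number of parameters grows linearly with the number of lags.
params_for_4_lags = GRID * 4
```

With a 17 × 17 grid, every additional lag adds 289 parameters, which is why the extension calls for more data.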
Using the new method, we were able to demonstrate contrast invariant edge kernels in both simple and complex cells. Contrast invariant edge kernels in simple cells have not been previously described. The model cell described in Figure 2 did not show suppressive regions in the edge kernel in response to stimulation with the movie sequences. In some conditions, the model receptive field in Figure 2 did predict a single excitatory region in the edge map kernel, centered between the two subfields of the luminance contrast kernel. Therefore, we can conclude that an oriented linear filter with a threshold predicts most of the luminance and part of the edge kernel in V1 simple cells, but does not predict both the position of some excitatory regions and the suppressive regions. The analysis of complex cell kernels into orientation specific edge response indicates that while the excitatory region arises from the preferred orientation of the cell (as measured by gratings), some of the antagonistic regions arise from orthogonal orientations (Figure 6a). We believe that this nonlinear suppression represents a novel feature of the receptive field organization whose spatial extent and orientation tuning have not been previously characterized. 
The method proposed here allows for the calculation of the neuron’s kernels with respect to different feature maps, such as a luminance map and the “edge” map (Figure 3). We noted a number of cases where the neural response was correlated with both maps. In principle, such a result could be simply due to the fact that the maps themselves are correlated. A principled way to deal with correlated feature maps is dictated by linear regression theory. If “main effects” are found with respect to two different feature maps, Φ1 and Φ2, a next step would be to build a compound model by defining a new feature map that represents the concatenation of Φ1 and Φ2, Φ = [Φ1 Φ2], and run the same algorithm, which will compute new kernels with respect to these two maps taking into account any possible cross-correlations. If the feature maps are approximately orthogonal (i.e., uncorrelated), the resulting kernels are clearly the same as those obtained by doing a regression on each feature map individually. This is the case in this study, as the maximal cross-covariance between the luminance-contrast and “edge” map was very small, 0.04, meaning that the maps are nearly orthogonal. As a consequence, estimating the maps separately is justified in our case. We note that the responses of cells to image attributes other than luminance contrast are consistent with previous data showing that cortical cells may respond to image boundaries that are not defined by luminance cues alone, such as illusory contours (Grosof, Hawken, & Shapley, 1993) and second order motion (Mareschal & Baker, 1999). Thus, we do not believe this phenomenon arises only when using natural image stimulation. 
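The regression argument can be checked on a toy problem. The sketch below is not the analysis used in the study: it replaces each feature map by a single mean-subtracted scalar feature, and all names and numbers are illustrative. It compares a regression on each feature alone with a joint regression on the concatenated features, once for orthogonal features and once for correlated ones.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def separate_fit(x, r):
    """Least-squares weight for a single mean-subtracted feature."""
    return dot(x, r) / dot(x, x)


def joint_fit(x1, x2, r):
    """Least-squares weights for the concatenated features [x1 x2]."""
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    b1, b2 = dot(x1, r), dot(x2, r)
    det = s11 * s22 - s12 * s12
    return ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)


def response(x1, x2, w1=2.0, w2=3.0):
    """Noise-free linear response with known weights (toy values)."""
    return [w1 * a + w2 * b for a, b in zip(x1, x2)]


x1 = [1.0, -1.0, 1.0, -1.0]
x2_orth = [1.0, 1.0, -1.0, -1.0]  # dot(x1, x2_orth) == 0
x2_corr = [2.0, -1.0, 0.0, -1.0]  # dot(x1, x2_corr) == 4

r_orth = response(x1, x2_orth)
r_corr = response(x1, x2_corr)

orth_separate = (separate_fit(x1, r_orth), separate_fit(x2_orth, r_orth))
orth_joint = joint_fit(x1, x2_orth, r_orth)
corr_separate = (separate_fit(x1, r_corr), separate_fit(x2_corr, r_corr))
corr_joint = joint_fit(x1, x2_corr, r_corr)
```

In the orthogonal case both fits recover the weights (2, 3). In the correlated case only the joint fit does, while the separate fit for the first feature absorbs part of the second feature's contribution.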
We envision similar techniques to the “feature map” approach proposed here as potential tools in psychophysical research. In the response classification images method (Beard & Ahumada, 1998), the noise is uncorrelated in space. To be able to use correlated noise (such as bandpass filtered white noise), averaging of the noise samples is not the right calculation to estimate the kernel (or classification image). Instead, the average classification images should be premultiplied by the inverse of the noise cross-correlation matrix, as is effectively done by the algorithm in this study. Also, multiple features (besides the luminance of the images) may mediate performance in a particular psychophysical task. The technique described here could allow the investigator to explore such dependencies. 
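As a sketch of the correction described above, the two estimates can be written side by side. The notation here is not from the original text: n_k denotes the noise field on trial k, r_k the observer's response on that trial, K the number of trials, and C the noise correlation matrix.

```latex
\hat{w}_{\text{white}} \;\propto\; \frac{1}{K}\sum_{k=1}^{K} r_k\, n_k ,
\qquad
\hat{w}_{\text{correlated}} \;\propto\; C^{-1}\,\frac{1}{K}\sum_{k=1}^{K} r_k\, n_k ,
\qquad
C = \frac{1}{K}\sum_{k=1}^{K} n_k\, n_k^{\top} .
```

When C is proportional to the identity matrix, the two expressions agree up to a scale factor, which is the uncorrelated-noise case.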
It is unknown at present if the kernels obtained using natural image stimulation are identical to or different from kernels derived from other stimulus ensembles, such as bars, spots of light, or sinusoidal gratings (Ringach et al., 1997b). The data we have collected so far indicate that mapping receptive fields with subspace reverse correlation (which uses spatial grating stimuli) and with natural image sequences yields similar results (Figure 4). A detailed comparison, however, requires a larger data set than the one we now have. It is also unknown if other feature maps, involving more elaborate two-dimensional features, such as corners and junctions, would be better correlated with the responses of some neurons. We plan to exploit the method to address these interesting questions in future work. 
It will also be important in the future to address some of the weaknesses of the technique. In the present experiments we presented only one trial per movie segment. In part, this was because it was unknown in these initial experiments using natural image stimulation how much data would be required to estimate the receptive fields. Because the firing rates are relatively low, it is difficult to obtain from such data sets reliable estimates of the instantaneous firing rate of the neuron as well as the noise in its response. This is unfortunate, as these numbers are required to calculate the amount of response variance explained by each of the kernels. To do so, it will be necessary to measure a number of repeats for each trial. Another weakness of the present approach is that the linear model in equation (1) is not entirely satisfactory as it ignores nonlinear operations that we know are present in simple cells of V1, such as cortical gain control (Carandini et al., 1997). Identifying nonlinear models from the responses of neurons to natural stimulation is one area for future research. 
Theoretical studies have argued that a critical component in understanding how the brain processes sensory information is to investigate the statistical properties of the signals encountered in the natural environment (Field, 1987; Tolhurst, Tadmor, & Chao, 1992; Olshausen & Field, 1996; Olshausen & Field, 1997; Dong & Atick, 1996; Bell & Sejnowski, 1997; van Hateren, 1998). A complementary line of research is to explore how the cortex processes this particular ensemble of signals. The use of natural stimuli to study the physiology of the visual system has up to now been limited (van Hateren, 1987; Dan, Atick, & Reid, 1996; Baddeley et al., 1997; Gallant, Connor, & van Essen, 1998). Here we showed, for the first time, that it is experimentally feasible to measure the receptive field structure of visual neurons from their responses to natural image sequences. This methodology may pave the way to evaluating the similarities and differences in visual cortical processing when the cortex is faced with stimulus ensembles of varying complexity. The method may also generalize the classification image technique so that correlated noise and multiple feature maps can be used in the study of human psychophysical performance. 
Acknowledgments
This research was supported by National Institutes of Health Grants EY-12816 (D.L.R.), EY-08300 (M.J.H.), and EY-01472 (R.S.), and a Sloan Foundation grant to New York University in support of their Theoretical Neuroscience Program. Commercial relationships: None. 
Appendix A
In this paper we restrict the analysis and attempt to predict the response at time t + τ from the feature map at time t. The response at time t + τ was defined as the total number of spikes in the segment. We picked a window width of Δt = 60 ms and used a fixed delay τ = 60 ms (this is the average delay in our population of V1 cells; Ringach et al., 1997a). The central portion of the feature map was subsampled on a square grid of 17 × 17. The visual area represented by this grid was varied from cell to cell to make sure it covered their receptive fields. These data were arranged in a column data vector, u(t), having 289 entries. 
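The data preparation described above might look as follows. This is a sketch under stated assumptions, not the code used in the study: the counting window is placed at [t + τ, t + τ + Δt), the grid is filled by nearest-pixel sampling, and the helper names `count_spikes` and `subsample_center` are hypothetical.

```python
def count_spikes(spike_times_ms, t_ms, tau_ms=60.0, dt_ms=60.0):
    """Count spikes in [t + tau, t + tau + dt); the window placement is assumed."""
    start = t_ms + tau_ms
    return sum(1 for s in spike_times_ms if start <= s < start + dt_ms)


def subsample_center(feature_map, span, grid=17):
    """Sample the central span x span pixels on a grid x grid lattice and flatten.

    Nearest-pixel sampling is an assumption; span must not exceed the map size.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    top, left = (rows - span) // 2, (cols - span) // 2
    step = (span - 1) / (grid - 1)
    idx = [round(k * step) for k in range(grid)]
    # Row-major flattening gives the data vector u(t) with grid * grid entries.
    return [feature_map[top + i][left + j] for i in idx for j in idx]


# Toy usage: a 64 x 64 ramp image, sampling the central 33 x 33 pixels.
toy_map = [[float(r * 64 + c) for c in range(64)] for r in range(64)]
u = subsample_center(toy_map, span=33)
n_spikes = count_spikes([10.0, 65.0, 100.0, 119.9, 120.0], t_ms=0.0)
```

Changing `span` from cell to cell plays the role of varying the visual area covered by the grid.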
A variant of the recursive least-squares (RLS) algorithm was implemented in Matlab (Mathworks, Natick, MA) to process the data. The analysis was run on an SGI Onyx 2. The algorithm is described in Table 13.2 in Haykin (1991). Essentially, it consists of two main steps: forward prediction and adaptation. The forward prediction stage is when the present estimate of the kernel is used to predict the neuron’s response and errors in prediction are computed. The adaptation step is when the kernel estimate is updated with a correction factor to bring the estimate closer to the true kernel. It is in the computation of the correction factor that the correlations in the image statistics enter the algorithm. The mathematical derivation of the algorithm can be found in Haykin (1991). Pseudo code follows: 
Figure 7
 
Pseudo-code of a modified RLS algorithm used to compute the optimal linear kernels in this study.
Here, the variables have been discretized in space and time: I(i,j,n) represents the image at location (i,j) for the n-th stimulus frame in the movie sequence and similarly for the other variables. The variable w(n) is an N × 1 vector representing the estimate of the weights at time step n. When we begin the process we have no data, so we set the initial value of w to zero (line 1). N is the total number of parameters to be estimated. In our case we have N = 289 parameters. P(n) is an N × N matrix representing a recursive estimate of the inverse of the correlation matrix, and δ = 0.00001 is a small number. Two modifications were made to the standard RLS algorithm. First, we added a recursive estimate of the mean response of the cell µ that is subtracted from the response r(n) at each step (lines 8 and 9). This is done to factor out slow trends in the excitability of the neuron, as we are only interested in explaining departures in the response of the cell from its mean. The forgetting factor β = 0.99 corresponds to a time constant of ≈ 6 s. A second modification is the spatial smoothing of the estimated coefficients in lines 18 and 19. The standard RLS algorithm does not include any knowledge about the spatial relationship between the different coordinates. Here, we chose to smooth the estimates with a 3 × 3 pixel Gaussian kernel every Q = 450 frames (equivalent to 30 s of video). The smoothing kernel had a time-varying width given by σ(n) = 0.4 + 5Q/n pixels. Our simulations indicated that adding this sort of “annealing” smoothing step increases the convergence rate of the algorithm. 
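Because the pseudo-code figure is not reproduced in this text, the following is only a sketch of how a textbook RLS update could be combined with the two modifications described above. It is not the authors' Matlab implementation. The form of the running-mean recursion, the border handling in the smoothing step, and all function names are assumptions.

```python
import math
import random


def rls_init(n_params, delta=1e-5):
    """Start from w = 0 and P = I / delta, with delta a small number."""
    w = [0.0] * n_params
    P = [[1.0 / delta if i == j else 0.0 for j in range(n_params)]
         for i in range(n_params)]
    return w, P


def rls_step(w, P, u, target):
    """One textbook RLS update in place; returns the prediction error."""
    n = len(w)
    # Forward prediction with the current kernel estimate.
    err = target - sum(wi * ui for wi, ui in zip(w, u))
    # Gain vector k = P u / (1 + u' P u); P carries the input correlations.
    Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(ui * x for ui, x in zip(u, Pu))
    k = [x / denom for x in Pu]
    # Adaptation: correct the kernel, then update the inverse-correlation estimate.
    for i in range(n):
        w[i] += k[i] * err
        for j in range(n):
            P[i][j] -= k[i] * Pu[j]  # valid because P is symmetric
    return err


def smooth(w, side, sig):
    """Smooth a flat side*side kernel with a normalized 3 x 3 Gaussian.

    Border samples are replicated, which is an assumption of this sketch.
    """
    g = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sig * sig))
          for dx in (-1, 0, 1)] for dy in (-1, 0, 1)]
    total = sum(map(sum, g))

    def at(i, j):
        return w[min(max(i, 0), side - 1) * side + min(max(j, 0), side - 1)]

    return [sum(g[dy + 1][dx + 1] * at(i + dy, j + dx)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / total
            for i in range(side) for j in range(side)]


def fit_kernel(inputs, responses, side, beta=0.99, q=450):
    """Modified RLS: running-mean subtraction plus annealed smoothing."""
    w, P = rls_init(side * side)
    mu = 0.0
    for n, (u, r) in enumerate(zip(inputs, responses), start=1):
        mu = beta * mu + (1.0 - beta) * r  # assumed form of the recursive mean
        rls_step(w, P, u, r - mu)
        if n % q == 0:
            w[:] = smooth(w, side, 0.4 + 5.0 * q / n)  # sigma(n) from the text
    return w


# Toy usage: a 3 x 3 kernel and noise-free responses (illustrative values only).
rng = random.Random(0)
true_w = [0.0, 0.0, 0.0, -1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
inputs = [[rng.gauss(0.0, 1.0) for _ in range(9)] for _ in range(300)]
responses = [sum(t * x for t, x in zip(true_w, u)) for u in inputs]

w_plain, P_plain = rls_init(9)
for u, r in zip(inputs, responses):
    rls_step(w_plain, P_plain, u, r)
plain_err = max(abs(a - b) for a, b in zip(w_plain, true_w))

w_mod = fit_kernel(inputs, responses, side=3, q=100)
```

On a grid this small the smoothing visibly blurs the estimate, so the toy run only illustrates the control flow. With N = 289 each update costs on the order of N² operations in this plain-Python form.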
There are important convergence results of the RLS algorithm that are worth mentioning here (Haykin, 1991). First, the estimate of w converges in the mean to the true kernel. In other words, the estimation of the receptive field is unbiased. Second, the variance of the prediction error converges to the variance of the noise in the system, i.e., under the assumption that the response of the neuron is contaminated by independent noise, the variance of the response prediction and the true response are equal. Thus, we are guaranteed that the model in Equation (1) will match both the mean and variance of the neural response. 
One way to experimentally investigate the convergence of the algorithm when the true value of w is unknown (such as when we apply the algorithm to real data) consists in calculating the relative change in the norm of w after Q steps of the RLS algorithm:  
(4)
 
The magnitude of changes in w will never decrease beyond a lower bound set by the noise in the system. Thus, we expect Δw(n) to decrease and asymptote at some finite value. At this point we considered the algorithm to have converged in the mean. After this time the values of w(n) may be averaged to yield more accurate estimates. In the population of cells studied the algorithm converged, on average, after 15 minutes of video. Finally, in those cases where the calculation of the feature map required an estimate of the gradient, Sobel operators were used (Pratt, 1991). 
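The convergence check can be sketched in code. The exact form of Equation (4) is not reproduced in this text, so the expression below, the norm of the change in w over Q steps divided by the norm of the earlier estimate, is only one plausible reading. The plateau rule and its thresholds are hypothetical.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def relative_change(w_now, w_prev):
    """||w(n) - w(n-Q)|| / ||w(n-Q)||, an assumed reading of Equation (4)."""
    return norm([a - b for a, b in zip(w_now, w_prev)]) / norm(w_prev)


def has_plateaued(deltas, tol=0.05, last=3):
    """Hypothetical rule: the last few values of delta-w differ by less than tol."""
    recent = deltas[-last:]
    return len(recent) == last and max(recent) - min(recent) < tol


# Toy usage with made-up snapshots of w taken every Q frames.
example = relative_change([3.0, 4.5], [3.0, 4.0])
still_changing = has_plateaued([0.9, 0.5, 0.2])
settled = has_plateaued([0.9, 0.5, 0.21, 0.2, 0.22])
```

Once the check reports a plateau, later snapshots of w could be averaged, as described above.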
References
Andrews, B. W., & Pollen, D. A. (1979). Relationship between spatial frequency selectivity and receptive field profile of simple cells. Journal of Physiology, 287, 163–176.
Baddeley, R., Abbott, L. F., Booth, M. C., Sengpiel, F., Freeman, T., Wakeman, E. A., & Rolls, E. T. (1997). Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 264, 1775–1783.
Beard, B. L., & Ahumada, A. J. (1998). A technique to extract relevant image features for visual tasks. Human Vision and Electronic Imaging III, SPIE Proceedings, 3299, 79–85.
Bell, A. J., & Sejnowski, T. J. (1997). The “independent components” of natural scenes are edge filters. Vision Research, 37, 3327–3338.
Carandini, M., Heeger, D. J., & Movshon, J. A. (1997). Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience, 17, 8621–8644.
Chen, C. C., Kasamatsu, T., Polat, U., & Norcia, A. M. (2001). Contrast response characteristics of long-range interactions in cat striate cortex. Neuroreport, 12, 655–661.
Dan, Y., Atick, J. J., & Reid, R. C. (1996). Efficient coding of natural scenes in the lateral geniculate nucleus: Experimental test of a computational theory. Journal of Neuroscience, 16, 3351–3356.
DeBoer, E., & Kuyper, P. (1968). Triggered correlation. IEEE Transactions on Biomedical Engineering, 15, 169–179.
DeValois, R., & DeValois, K. K. (1988). Spatial Vision. New York: Oxford University Press.
DeValois, R. L., Albrecht, D. G., & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Research, 22, 545–559.
DiCarlo, J. J., Johnson, K. O., & Hsiao, S. S. (1998). Structure of receptive fields in area 3b of primary somatosensory cortex in the alert monkey. Journal of Neuroscience, 18, 2626–2645.
Dong, D. W., & Atick, J. J. (1996). Statistics of natural time-varying images. Network: Computation in Neural Systems, 6, 345–358.
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4, 2379–2394.
Gallant, J. L., Connor, C. E., & van Essen, D. C. (1998). Neural activity in areas V1, V2 and V4 during free viewing of natural scenes compared to controlled viewing. Neuroreport, 9, 2153–2158.
Golub, G. H., & van Loan, C. F. (1989). Matrix computations. Baltimore: Johns Hopkins University Press.
Grosof, D. H., Hawken, M. J., & Shapley, R. M. (1993). Macaque V1 neurons can signal ‘illusory’ contours. Nature, 365, 550–552.
Haykin, S. (1991). Adaptive Filter Theory (2nd ed.). Prentice-Hall.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology (London), 195, 215–245.
Johnson, E. N., Hawken, M. J., & Shapley, R. M. (2001). The spatial transformation of color in the primary visual cortex of the macaque monkey. Nature Neuroscience, 4, 409–416.
Jones, J. P., & Palmer, L. A. (1987). The two-dimensional spatial structure of simple receptive fields in the cat striate cortex. Journal of Neurophysiology, 58, 1187–1258.
Kapadia, M. K., Westheimer, G., & Gilbert, C. D. (2000). Spatial distribution of contextual interactions in primary visual cortex and in visual perception. Journal of Neurophysiology, 84, 2048–2062.
Knierim, J. J., & van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67, 961–980.
Lee, Y. W., & Schetzen, M. (1965). Measurement of the Wiener kernels of a nonlinear system by cross-correlation. International Journal of Control, 2, 237–254.
Levitt, J. B., & Lund, J. S. (1997). Contrast dependence of contextual effects in primate visual cortex. Nature, 387, 73–76.
Mareschal, I., & Baker, C. L., Jr. (1999). Cortical processing of second-order motion. Visual Neuroscience, 16, 527–540.
Marmarelis, P. N., & Marmarelis, V. Z. (1978). Analysis of Physiological Systems: The White Noise Approach. New York: Plenum Press.
Morrone, M. C., Burr, D. C., & Maffei, L. (1982). Functional implications of cross-orientation inhibition of cortical visual cells. I. Neurophysiological evidence. Proceedings of the Royal Society of London. Series B: Biological Sciences, 216, 335–354.
Movshon, J. A., Thompson, I. D., & Tolhurst, D. J. (1978a). Spatial summation in the receptive fields of simple cells in the cat’s striate cortex. Journal of Physiology (London), 283, 53–77.
Movshon, J. A., Thompson, I. D., & Tolhurst, D. J. (1978b). Receptive field organization of complex cells in the cat’s striate cortex. Journal of Physiology (London), 283, 79–99.
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.
Olshausen, B. A., & Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37, 3311–3325.
Parker, A. J., & Hawken, M. J. (1988). Two-dimensional spatial structure of receptive fields in monkey striate cortex. Journal of the Optical Society of America A, 5, 598–605.
Pratt, W. K. (1991). Digital Image Processing (2nd ed.). New York: John Wiley & Sons.
Rieke, F., Warland, D., van Steveninck, R., & Bialek, W. (1997). Spikes. Boston: Massachusetts Institute of Technology Press.
Ringach, D. L., Hawken, M. J., & Shapley, R. (1997a). Dynamics of orientation tuning in macaque primary visual cortex. Nature, 387, 281–284.
Ringach, D. L., Sapiro, G., & Shapley, R. (1997b). A subspace reverse correlation technique for the study of visual neurons. Vision Research, 37, 2455–2464.
Sceniak, M. P., Ringach, D. L., Hawken, M. J., & Shapley, R. (1999). Contrast’s effect on spatial summation by V1 neurons. Nature Neuroscience, 2, 733–739.
Sigman, M., Cecchi, G. A., Gilbert, C. D., & Magnasco, M. O. (2001). On a common circle: Natural scenes and Gestalt rules. Proceedings of the National Academy of Sciences of the United States of America, 98, 1935–1940.
Sillito, A. M., Grieve, K. L., Jones, H. E., Cudeiro, J., & Davis, J. (1995). Visual cortical mechanisms detecting focal orientation discontinuities. Nature, 378, 439–440.
Skottun, B. C., DeValois, R. L., Grosof, D. H., Movshon, J. A., Albrecht, D. G., & Bonds, A. B. (1991). Classifying simple and complex cells on the basis of response modulation. Vision Research, 31, 1079–1086.
Spitzer, H., & Hochstein, S. (1985). A complex-cell receptive-field model. Journal of Neurophysiology, 53, 1266–1286.
Szulborski, R. G., & Palmer, L. A. (1990). The two-dimensional spatial structure of nonlinear subunits in the receptive fields of complex cells. Vision Research, 30, 249–254.
Theunissen, F. E., David, S. V., Singh, N. C., Hsu, A., Vinje, W. E., & Gallant, J. L. (2001). Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network, 12, 289–316.
Tolhurst, D. J., & Dean, A. F. (1990). The effects of contrast on the linearity of the spatial summation of simple cells in the cat’s striate cortex. Experimental Brain Research, 79, 582–588.
Tolhurst, D. J., & Heeger, D. J. (1997). Comparison of contrast-normalization and threshold models of the responses of simple cells in cat striate cortex. Visual Neuroscience, 14, 293–309.
Tolhurst, D. J., Tadmor, Y., & Chao, T. (1992). Amplitude spectra of natural images. Ophthalmic and Physiological Optics, 12, 229–232.
van Hateren, J. H. (1987). Processing of natural time series of intensities by the visual system of the blowfly. Journal of the Optical Society of America A, 37, 3407–3416.
van Hateren, J. H., & van der Schaaf, A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London B, 265, 359–366.
Victor, J. D. (1992). Nonlinear systems analysis in vision: Overview of kernel methods. In R. Pinter & B. Nabet (Eds.), Nonlinear Vision: Determination of Neural Receptive Fields, Function and Networks (Vol. 1, pp. 1–37). Cleveland: CRC Press.
Walker, G. A., Ohzawa, I., & Freeman, R. D. (1999). Asymmetric suppression outside the classical receptive field of the visual cortex. Journal of Neuroscience, 19, 10536–10553.
Walker, G. A., Ohzawa, I., & Freeman, R. D. (2000). Suppression outside the classical cortical receptive field. Visual Neuroscience, 17, 369–379.
Zar, J. H. (1996). Biostatistical Analysis (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Zipser, K., Lamme, V. A., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. Journal of Neuroscience, 16, 7376–7389.
Figure 1
 
Examples of feature maps obtained from the original frames in the movie. (a) Three still frames taken from the movie “Sleeper.” (b) The luminance-contrast maps associated with the original images. Regions in white indicate positive values of contrast, whereas regions in black indicate negative values. (c) The edge map associated with the original images. Locations where large gradients in the luminance contrast map are located are emphasized. (d) Oriented edge maps associated with the original images when θ = 0; this choice accentuates oriented boundaries that are near vertical. (e) Oriented edge maps associated with the original images when θ = π/2; this emphasizes oriented edges that are near horizontal.
Figure 2
 
Performance of method on a simulated simple cell. The system consists of a cascade of a spatial linear filter (acting on the luminance contrast of the input) followed by a hard-step threshold nonlinearity. The signal z(n) represents Gaussian additive noise. The simulated receptive field had two subfields, one excitatory (in red) and one inhibitory (in blue). The algorithm had to estimate this receptive field given the input image sequence and the response of the cell, r(n). The result of applying the method is shown below the image of the simulated receptive field.
Figure 3
 
Analysis of receptive field structure using natural image stimulation. Each panel in this figure shows the estimated luminance contrast kernel (on the left) and the edge kernel (on the right) for several V1 cells. Additional information is displayed on top of each panel: the cell’s laminar location, the ratio between the first-harmonic component and the mean of the response (F1/F0) for the optimal sinusoidal grating stimulus, followed by the classification of the cell as simple (F1/F0 > 1) or complex (F1/F0 ≤ 1), the angular size represented by one side of the 17 × 17 grid, and the preferred orientation of the cell as measured with conventional drifting gratings. The orientation of the bar corresponds to the orientation of the grating that generated the best response. In (l) the bar was omitted because the cell was not well tuned. In most cases we observe that the preferred orientation of the cell closely matches the axis of elongation of the estimated receptive fields. Each kernel was normalized independently so that its maximum absolute value was one. This makes optimal use of the pseudo-color map, which ranges from −1 (blue) to +1 (red).
Figure 4
 
(a,b) Comparison of receptive fields mapped with natural image sequences (right panel) and with subspace reverse correlation (left panel) for two V1 cells. (c) Scatter plot of the cell’s preferred orientation as measured with steady-state drifting gratings (x-axis) versus the angle of elongation of the strongest subfield in the kernels for the cells in Figure 3 (a–k). Open squares represent the elongation of subfields in the luminance contrast kernel and open circles represent the elongation for the edge map kernels. Cases where the ‘aspect ratio’ of the subfield, defined as the ratio between the largest and smallest eigenvalues of the (centered) second-order moment matrix, was less than 1.2 were ignored. A small aspect ratio could result from noise in the kernels (such as Figure 3i, left panel) or from a round subfield (such as Figure 3l, right panel).
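The aspect-ratio criterion in (c) can be illustrated as follows. This is a sketch under stated assumptions, not the authors' procedure: the caption does not say how the strongest subfield was segmented or weighted, so the hypothetical function `subfield_shape` simply takes a non-negative weight map (for instance, absolute kernel values inside a subfield mask). The angle is measured in array coordinates, with x along columns and y along rows.

```python
import numpy as np

def subfield_shape(weights):
    """Aspect ratio and elongation angle from second-order moments.

    weights: 2D non-negative array describing one subfield.
    Returns (aspect_ratio, angle_deg) with the angle in [0, 180).
    """
    weights = np.asarray(weights, dtype=float)
    ys, xs = np.indices(weights.shape)
    w = weights / weights.sum()
    # Center the coordinates on the weighted centroid.
    dx = xs - (w * xs).sum()
    dy = ys - (w * ys).sum()
    # Centered second-order moment matrix.
    sxy = (w * dx * dy).sum()
    m = np.array([[(w * dx * dx).sum(), sxy],
                  [sxy, (w * dy * dy).sum()]])
    evals, evecs = np.linalg.eigh(m)   # eigenvalues in ascending order
    aspect_ratio = evals[1] / evals[0]
    vx, vy = evecs[:, 1]               # axis of largest spread
    angle_deg = np.degrees(np.arctan2(vy, vx)) % 180.0
    return aspect_ratio, angle_deg

# Hypothetical subfield: a Gaussian blob elongated along x on a 17 x 17 grid.
yy, xx = np.indices((17, 17))
blob = np.exp(-((xx - 8) ** 2) / (2 * 3.0 ** 2) - ((yy - 8) ** 2) / 2.0)
aspect, angle = subfield_shape(blob)
keep = aspect >= 1.2   # subfields below 1.2 were ignored in the figure
```

A round blob would give an aspect ratio near one and an ill-defined angle, which is why such cases are excluded from the scatter plot.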
Figure 5
 
Evaluating the statistical significance of kernel features. The figure shows the z-transformed values of some of the kernels depicted in Figure 3. All features described in the text had peak absolute z values larger than 4.
Figure 6
 
The use of oriented edge maps to analyze the contribution of different orientations to the cell’s response. Each panel in this figure shows four kernels. In left-to-right order they represent: the luminance contrast kernel, the edge kernel, the oriented-edge kernel when the angle θ was selected to emphasize edges with orientations similar to the preferred orientation of the cell, and the oriented-edge kernel when the angle θ was selected to accentuate boundaries at the orthogonal orientation. These are all complex cells, and they have weak structure in their luminance contrast kernels.
Figure 7
 
Pseudo-code of a modified RLS algorithm used to compute the optimal linear kernels in this study.
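The pseudo-code of Figure 7 is not reproduced in this text, so the authors' modifications cannot be shown here. For orientation only, the sketch below gives the standard textbook recursive least squares (RLS) update for a linear kernel. The function name `rls_kernel`, the forgetting factor `lam`, and the initialization constant `delta` are assumptions of this example and are not taken from the study.

```python
import numpy as np

def rls_kernel(X, r, lam=1.0, delta=1e3):
    """Textbook RLS estimate of a linear kernel w such that r(n) ~ w . x(n).

    X:     array (n_samples, n_features), one stimulus vector per row.
    r:     array (n_samples,), the response to each stimulus.
    lam:   forgetting factor (1.0 weights all samples equally).
    delta: scale of the initial inverse correlation matrix.
    """
    n_features = X.shape[1]
    w = np.zeros(n_features)
    P = delta * np.eye(n_features)        # inverse correlation estimate
    for x, y in zip(X, r):
        Px = P @ x
        k = Px / (lam + x @ Px)           # gain vector
        w = w + k * (y - w @ x)           # correct by the prediction error
        P = (P - np.outer(k, Px)) / lam   # update inverse correlation
    return w

# Hypothetical check on noiseless data generated by a known kernel.
rng = np.random.default_rng(0)
w_true = rng.normal(size=10)
X = rng.normal(size=(500, 10))
w_hat = rls_kernel(X, X @ w_true)
```

With `lam=1.0` the recursion converges to a lightly regularized least-squares solution, so it does not need to store or invert the full stimulus correlation matrix at once.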