In this section we explore the interaction between the motion energy model of V1 directionally selective (DS) neurons (Adelson & Bergen,
1985) and the stimuli used in
Experiment 2. To explore how the motion energy varied across all the orientations, speeds, and SFs present in our stimulus, it was helpful to define a V1 neuron's speed tuning as the ratio between the (peak) temporal and spatial frequency tuning of the neuron (Equation A2.1). The full battery of DS filters could then be defined by their direction, speed, and spatial-frequency tuning as follows:
- Thirty-two directions evenly spaced around the clock.
- Thirteen evenly spaced speeds from 0% (static) to 150% of the carrier signal speed (3.95 deg/s).
- Eight SFs, spaced in half-octave steps, from 50% to 700% of the peak SF of the broadband carrier signal (0.75 c/deg).
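The filter-bank parameterization above can be sketched as follows. This is an illustrative reconstruction, not the original model code; the variable names are ours, and the numerical values are taken from the text (the quoted 50%–700% SF endpoints and half-octave spacing are not exactly consistent, so the sketch follows the half-octave spacing).

```python
import numpy as np

# Values from the text; names are illustrative.
carrier_speed = 3.95   # deg/s, carrier signal speed
carrier_sf = 0.75      # c/deg, peak SF of the broadband carrier

directions = np.linspace(0, 360, 32, endpoint=False)   # 32 directions around the clock
speeds = np.linspace(0.0, 1.5 * carrier_speed, 13)     # 13 speeds, 0% to 150% of carrier
sfs = 0.5 * carrier_sf * 2 ** (0.5 * np.arange(8))     # 8 SFs in half-octave steps from 50%

# Speed tuning is defined as the ratio TF / SF, so each (speed, SF)
# pair fixes the filter's peak temporal frequency:
tfs = speeds[:, None] * sfs[None, :]                   # Hz, shape (13 speeds, 8 SFs)
```

Because the peak TF is derived from speed × SF, the TF tuning necessarily differs across SF channels, which is the point made in the next sentence.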
This resulted in the temporal frequency tuning of the DS filters varying across each SF channel (see
Figure 9). The spatial frequency and directional bandwidths of all the model neurons were held constant at 1.5 octaves and 45° (half width at half height), respectively, in keeping with the observed bandwidths in primate area V1 (De Valois, Yund, & Hepler,
1982; Snowden, Treue, & Andersen,
1992).
The stimuli were accurate reconstructions of trials used in
Experiment 2 in terms of the aperture positions and the spatial (256 × 256 pixels) and temporal (26 frames) resolution. However, to avoid artifacts introduced by the horizontal/vertical pixel raster, the direction of motion on each trial was randomized.
Convolution of the signal and sensor was performed in the Fourier domain (as a pointwise multiplication), and the result was inverse-transformed back into the spatial domain. The square root of the sum of the squares of the real and imaginary components was taken to represent the motion energy at each point in space for each DS filter, a computation that is formally equivalent to summing the full-wave rectified (squared) outputs of odd- and even-phase neurons to generate a phase-invariant output (Adelson & Bergen,
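The Fourier-domain filtering and energy step can be sketched as below. This is a toy illustration under simplifying assumptions, not the paper's filter bank: the "DS filter" here is only a one-sided temporal-frequency Gaussian (a real DS filter also has spatial-frequency and direction tuning), and all names are ours. The key point it demonstrates is that a filter with one-sided spectral support yields a complex inverse transform whose real and imaginary parts act as even/odd quadrature pairs, so the complex magnitude gives phase-invariant energy.

```python
import numpy as np

rng = np.random.default_rng(0)
stim = rng.standard_normal((26, 64, 64))      # frames, y, x (reduced size for speed)
stim_hat = np.fft.fftn(stim)

# Toy temporal-frequency tuning curve; a real DS filter would also be
# tuned in spatial frequency and direction.
ft = np.fft.fftfreq(26)[:, None, None]        # temporal-frequency axis
gauss = np.exp(-((np.abs(ft) - 0.2) ** 2) / (2 * 0.05 ** 2))
filt_hat = np.where(ft > 0, gauss, 0.0)       # one-sided support -> quadrature pair

# Multiplication in the Fourier domain = convolution; the inverse transform
# is complex because the filter spectrum is not Hermitian.
response = np.fft.ifftn(stim_hat * filt_hat)

# sqrt(real^2 + imag^2): equivalent to summing squared even- and odd-phase
# outputs, i.e. phase-invariant motion energy at each point.
energy = np.sqrt(response.real ** 2 + response.imag ** 2)
```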
1985). A global motion analysis was achieved by collapsing the spatial domain and summing across all DS filters tuned to the same spatiotemporal frequency and direction. Each spatial frequency channel could then be represented as a 2D speed vs. direction image (as illustrated in
Figure 9a), in which the intensity of each region represents the global sum of motion energy across DS filters whose velocity tuning is denoted by the region's position in the image. The only filter normalization employed was to divide the output of each neuron by the sum of the absolute value of the receptive field across space and time; this had the effect of evening out the expected 1/f spatiotemporal frequency spectrum. No gain control, normalization, or inhibition occurred between neurons.
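The normalization and global pooling can be sketched as follows. This is a shape-level illustration with placeholder data, not the original analysis code: `energy` stands in for the per-filter energy maps, and `rf_abs_sum` for each filter's sum of the absolute value of its receptive field.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dir, n_speed, n_sf = 32, 13, 8

# Placeholder energy: one space-time map per (direction, speed, SF) filter.
energy = rng.random((n_dir, n_speed, n_sf, 26, 16, 16))

# RF normalization: divide each filter's output by the sum of the absolute
# value of its receptive field over space and time (stand-in values here).
rf_abs_sum = rng.random((n_dir, n_speed, n_sf)) + 1.0
energy_n = energy / rf_abs_sum[:, :, :, None, None, None]

# Global motion analysis: collapse the spatial (and temporal) domain, leaving
# one direction x speed map per SF channel (cf. Figure 9a).
global_map = energy_n.sum(axis=(3, 4, 5))     # shape (n_dir, n_speed, n_sf)
```

Each SF channel of `global_map` is the 2D speed-vs.-direction image described in the text.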
Noise and the sampling rate of neurons were not considered essential to the model output because discrimination thresholds were not derived from the output of the neurons. Additional factors such as the addition of Poisson noise (e.g., Dakin et al.,
2005) would have been necessary if direction discrimination thresholds were to be predicted. Further complexity could have been added by varying the bandwidths of the V1 neurons as a function of spatial or temporal frequency, as both the physiology (e.g., Bair & Movshon,
2004) and the psychophysics (e.g., Burr,
1981) would deem necessary, but this would make the resulting motion energy more complex to analyze. For instance, it would be more difficult to ascertain whether the directional bandwidth of the signal was the result of the stimulus or the sensor. By keeping the bandwidth of the sensor constant (in octaves) in the SF domain and constant across the speed tuning of the sensor, changes in signal bandwidth across these dimensions could be attributed to the stimulus, not the sensor.