In macaque visual cortex, the conventional view is that image motion is initially detected by direction-sensitive neurons that are tuned in terms of local spatial and temporal frequency (TF), from which speed is encoded later. We used functional magnetic resonance imaging (fMRI) adaptation to seek evidence for speed or TF tuning in human visual cortex. Drifting sine-wave gratings were presented in pairs (S1: adapter, 100% contrast; S2: probe, 15, 40 or 80% contrast). In each trial, either speed or TF was the same for S1 and S2, whereas the other dimension changed. We investigated whether the response was weaker (adapted) for repetitions of the same speed, indicating speed coding, or for repetitions of TF, indicating TF coding. For high-contrast (80%) probes, we observed clear speed coding in MT and MST with similar but weaker trends in several earlier visual areas. For medium- and low-contrast probes, our data indicated a trend towards TF coding in most visual areas studied. In a second experiment, we adjusted stimuli in terms of perceived rather than physical speed and found a trend for speed coding even for low-contrast probes. Our results suggest that speed coding dominates in MT/MST for high-contrast stimuli, and possibly also in other visual areas and/or at lower contrasts.

*t*-tests revealed a significant difference from 0° orientation. Priebe, Cassanello, and Lisberger (2003) classified neurons as speed-tuned if the 95% confidence interval of their fitted slope included a value of 1, as frequency-tuned if it included 0, and as unclassified otherwise. Based on this analysis, they reported that only 25% of MT neurons show speed tuning. A similar number were classed as temporal-frequency tuned and the remainder had intermediate properties. Stimulation with gratings consisting of two components of different spatial frequencies increased the proportion of MT neurons showing speed encoding, suggesting that speed coding may be more in evidence in MT for natural stimuli than for sine gratings. Comparable results have been reported for marmoset MT (Lui, Bourne, & Rosa, 2007).
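The classification rule described above can be sketched in a few lines. This is only an illustration of the decision logic as summarized here, not the original analysis code: the function name `classify_tuning` and the `"ambiguous"` label for an interval that contains both 0 and 1 are assumptions, since the text does not say how such a case was handled.

```python
def classify_tuning(ci_low, ci_high):
    """Classify a neuron from the 95% confidence interval of its fitted slope.

    Slope 1 is taken to indicate speed tuning and slope 0 temporal-frequency
    tuning, following the rule summarized in the text.
    """
    includes_speed = ci_low <= 1.0 <= ci_high
    includes_tf = ci_low <= 0.0 <= ci_high
    if includes_speed and includes_tf:
        # Assumption: the text does not cover this case, so flag it separately.
        return "ambiguous"
    if includes_speed:
        return "speed-tuned"
    if includes_tf:
        return "frequency-tuned"
    return "unclassified"


# Hypothetical confidence intervals, for illustration only.
examples = {
    (0.8, 1.2): classify_tuning(0.8, 1.2),
    (-0.2, 0.3): classify_tuning(-0.2, 0.3),
    (0.3, 0.7): classify_tuning(0.3, 0.7),
}
```

Neurons whose interval lies strictly between 0 and 1 fall into the unclassified group, which corresponds to the "intermediate properties" mentioned above.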

*C* is contrast, and *L*_{max} and *L*_{min} are the maximum and minimum luminances in the image.
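The formula these symbols belong to did not survive in this copy of the text. The symbol definitions match the standard Michelson definition of contrast, so, as an assumption about what was intended rather than a quotation, the expression would read:

```latex
C = \frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}}
```

Under this definition, a grating whose luminance swings between 0 and twice the mean has *C* = 1 (100% contrast), and a uniform field has *C* = 0.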

^{2}.

| | | S2 (probe) A: 2 cycles/deg, 4 Hz, 2 deg/sec | S2 (probe) B: 2 cycles/deg, 8 Hz, 4 deg/sec |
|---|---|---|---|
| S1 (adapter) A | 1 cycle/deg, 4 Hz, 4 deg/sec | same TF | same speed |
| S1 (adapter) B | 4 cycles/deg, 8 Hz, 2 deg/sec | same speed | same TF |
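The labels in the table follow from the relation speed = temporal frequency / spatial frequency. The sketch below recomputes them from the stimulus values listed in the table; the dictionary and function names are illustrative and do not come from the original study.

```python
from fractions import Fraction

# (spatial frequency in cycles/deg, temporal frequency in Hz), from the table.
ADAPTERS = {"A": (1, 4), "B": (4, 8)}
PROBES = {"A": (2, 4), "B": (2, 8)}


def speed(sf, tf):
    """Drift speed in deg/sec: temporal frequency divided by spatial frequency."""
    return Fraction(tf, sf)


def shared_dimension(adapter, probe):
    """Report which stimulus dimension the adapter and the probe have in common."""
    same_tf = adapter[1] == probe[1]
    same_speed = speed(*adapter) == speed(*probe)
    if same_tf and same_speed:
        return "same TF and speed"
    if same_tf:
        return "same TF"
    if same_speed:
        return "same speed"
    return "neither"


# One entry per (adapter, probe) pairing.
design = {
    (a, p): shared_dimension(ADAPTERS[a], PROBES[p])
    for a in ADAPTERS
    for p in PROBES
}
```

Each pairing shares exactly one dimension, which is what allows adaptation to speed to be separated from adaptation to TF.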

*t* = 1.25 (Boynton, Engel, Glover, & Heeger, 1996). The resulting reference time-courses were used to fit the time course of each voxel by means of a general linear model, separately for each participant. Parameters from 3D motion correction (translation and rotation) were included in the model as regressors of no interest, to increase power. Only those voxels that were revealed by the full model, surviving a statistical threshold of a false discovery rate of *p* < 0.001, were included for computing event-related averages.

*AI* ≤ 1 if adaptation is stronger for ‘same speed’ trials than for ‘same TF’ trials, indicating speed encoding, whereas −1 ≤ *AI* < 0 indicates that adaptation is stronger for ‘same TF’ than for ‘same speed’ (i.e., TF encoding). The adaptation index was computed separately for each ROI and for each stimulus contrast condition. Next, we computed 95% confidence intervals using a nonparametric bootstrapping procedure (Efron & Tibshirani, 1993). In brief, we created 10,000 samples by randomly sampling hemispheres with replacement to estimate the empirical variance of the data using the MATLAB function ‘bootstrp’ contained in the statistics toolbox. The resulting values were used to compute the 95% confidence intervals. A statistically significant adaptation index was assumed if the 95% confidence interval derived from this procedure did not include 0.
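The bootstrap step can be illustrated with a minimal Python sketch. It is not the original MATLAB analysis: it assumes a percentile interval on the mean adaptation index across hemispheres, whereas the text does not state how the interval was derived from the resampled values. The function names and the input values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Each bootstrap sample draws len(values) items with replacement, mirroring
    the resampling of hemispheres described in the text.
    """
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = means[round((alpha / 2) * n_boot)]
    upper = means[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def is_significant(ci):
    """An index counts as significant if its interval excludes 0."""
    lower, upper = ci
    return not (lower <= 0.0 <= upper)


# Hypothetical per-hemisphere adaptation indices for one ROI and contrast.
ai_per_hemisphere = [0.20, 0.30, 0.25, 0.40, 0.35, 0.30]
ci = bootstrap_ci(ai_per_hemisphere)
```

With these made-up values every resampled mean is positive, so the interval excludes 0 and the index would be read as evidence for speed encoding.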

^{2}). Since a single trial lasted 17 seconds, a run would last a minimum of 49 × 17 seconds (13.9 minutes) under these conditions. Initial pilot sessions showed that this was too demanding for participants, resulting in occasional head movements even in well-practiced participants. Therefore, each series of 49 trials was divided into three shorter runs of 17, 16 and 16 trials. Within each run, an additional trial was added at the beginning, which was later discarded from the analysis. This ensured that the first analyzed trial had a history that fitted the counterbalancing pattern. Furthermore, 34 seconds were added at the beginning of each run and 10 seconds at the end of each run to allow the hemodynamic response to stabilize. Thus, one scan run lasted either 350 or 333 seconds.
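The run durations quoted above can be checked with a short calculation. The constants are taken from the text; the variable and function names are illustrative.

```python
TRIAL_S = 17       # duration of one trial in seconds
LEAD_IN_S = 34     # added at the beginning of each run
LEAD_OUT_S = 10    # added at the end of each run
DUMMY_TRIALS = 1   # extra trial at the start of each run, discarded later


def run_duration(n_trials):
    """Total duration in seconds of a run with n_trials analyzed trials."""
    return LEAD_IN_S + (n_trials + DUMMY_TRIALS) * TRIAL_S + LEAD_OUT_S


# A single undivided series of 49 trials, before any padding.
undivided_s = 49 * TRIAL_S            # 833 seconds
undivided_min = undivided_s / 60      # about 13.9 minutes

# The three shorter runs actually used.
durations = [run_duration(n) for n in (17, 16, 16)]  # [350, 333, 333]
```

The 17-trial run comes to 350 seconds and each 16-trial run to 333 seconds, matching the figures in the text.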

| | | S2 (probe, 15% contrast) A | S2 (probe, 15% contrast) B |
|---|---|---|---|
| | Physical | 2 cycles/deg, 8 Hz, 4 deg/sec | 2 cycles/deg, 6 Hz, 3 deg/sec |
| | Perceived | 2 cycles/deg, 8 Hz, 4 deg/sec | 2 cycles/deg, 4 Hz, 2 deg/sec |
| S1 (adapter, 100% contrast) A | 1 cycle/deg, 4 Hz, 4 deg/sec | same speed | same TF |
| S1 (adapter, 100% contrast) B | 4 cycles/deg, 8 Hz, 2 deg/sec | same TF | same speed |