Abstract
Although extensive research has investigated the encoding of both two-dimensional motion and binocular disparity, relatively little is known about how the human visual system combines these cues to infer three-dimensional (3D) motion. 3D-motion stimuli produce two different signals on the two retinae. Percepts of 3D motion may therefore depend on a velocity-based cue (the difference between the two retinal velocities) or on the corresponding disparity-based cue (the change of binocular disparity over time).
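To make the distinction concrete, the two cues can be written as follows (a standard formulation rather than one given in the abstract; here $x_L$ and $x_R$ denote the horizontal retinal positions of a matched feature in the left and right eyes, and $\delta = x_L - x_R$ its binocular disparity):

$$\underbrace{\Delta v = \frac{dx_L}{dt} - \frac{dx_R}{dt}}_{\text{velocity-based cue}} \qquad\qquad \underbrace{\frac{d\delta}{dt} = \frac{d}{dt}\,(x_L - x_R)}_{\text{disparity-based cue}}$$

For a single matched feature the two expressions are mathematically identical; they differ in the order of operations the visual system would need to perform. The velocity-based cue requires computing monocular velocities first and then differencing them between the eyes, whereas the disparity-based cue requires computing binocular disparity first and then differentiating it over time. The stimuli described below dissociate the two by removing one signal while preserving the other.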
We performed a series of human fMRI experiments (3T BOLD, 2-shot spiral fMRI, (2.2 mm)³ voxels, 3 s TR) to isolate these velocity and disparity signals in the visual system, and to link them to percepts of 3D motion. Subjects viewed displays through a mirror stereoscope and performed a task to control attention. We selectively studied the disparity-based cue with dynamic random-dot stereograms that contained no systematic retinal velocity signals, and studied the velocity-based cue by parametrically varying the proportion of anticorrelated dots (dots with opposite contrast polarity in the two eyes), which degrades the disparity signal while leaving the monocular motions intact. We also parametrically varied the orientation of the dot-element motions from horizontal (yielding strong 3D-motion percepts) to vertical (yielding no percepts of 3D motion).
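The two parametric manipulations can be sketched in a few lines of NumPy (a schematic illustration only, not the stimulus code used in the experiments; the function names, dot counts, and parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_stereo_dots(n_dots=200, anticorr_prop=0.5):
    """Generate a dot field with a given proportion of anticorrelated dots
    (opposite contrast polarity in the two eyes degrades the disparity cue)."""
    pos = rng.uniform(-1.0, 1.0, size=(n_dots, 2))    # shared start positions
    c_left = rng.choice([-1.0, 1.0], size=n_dots)     # random contrast polarity
    flip = rng.random(n_dots) < anticorr_prop         # anticorrelated subset
    c_right = np.where(flip, -c_left, c_left)         # opposite polarity if flipped
    return pos, c_left, c_right

def frame(pos, theta_deg, speed=0.02, t=0):
    """Displace the dots for frame t; motion is mirrored between the eyes.

    theta_deg = 0  -> opposite horizontal motion (motion through depth)
    theta_deg = 90 -> opposite vertical motion (no 3D-motion percept)
    """
    theta = np.deg2rad(theta_deg)
    step = speed * t * np.array([np.cos(theta), np.sin(theta)])
    return pos + step, pos - step                     # left-eye, right-eye positions

pos, c_left, c_right = make_stereo_dots(anticorr_prop=0.25)
left_xy, right_xy = frame(pos, theta_deg=0.0, t=3)
```

With anticorr_prop = 1 the disparity-based cue is abolished while the mirrored monocular motions (the velocity-based cue) remain intact; rotating theta_deg from 0 to 90 leaves the retinal motion energy unchanged while eliminating the 3D-motion interpretation.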
We observed strong responses in human MT+ during the presentation of motion through depth. This is notable given prior observations of motion opponency in this area, because 3D-motion displays contain oppositely moving dots in corresponding parts of the two eyes' visual fields. Much like the percepts, MT+ responses were invariant to anticorrelation level for 3D-motion displays, but decreased with increasing anticorrelation for laterally moving control stimuli.
However, MT+ responses were also invariant to motion orientation, suggesting that net activity in this region does not straightforwardly track the strength of 3D-motion percepts. Instead, we identified an area in the posterior parietal lobe that appeared selectively responsive to stimuli yielding a percept of 3D motion. These results suggest that the processing of realistic 3D motion requires more than MT+.
Supported by the UT Austin Imaging Research Center and NWO Grant 2006/11353/ALW.