Abstract
Humans can optimally integrate sensory cues across different perceptual modalities to form a coherent percept (Ernst/Banks 2002). Here, we propose that optimal integration also occurs within a single perceptual modality. Specifically, we hypothesize that the perceived retinal speed of a translating intensity pattern results from Bayesian integration of sensory signals across independent spatiotemporal frequency channels, combined with a prior expectation for slow speeds (Stocker/Simoncelli 2006).
To test this hypothesis, we had four subjects perform a two-alternative forced-choice (2AFC) visual speed discrimination task. The reference stimulus was a broad-band compound grating drifting at a speed of 2 deg/s. Test stimuli were either drifting sine-wave gratings at one of two spatial frequencies (0.4 or 1.2 cycles/deg, at 30% contrast) or their superposition in either a “peaks-add” or a “peaks-subtract” phase configuration. We measured full psychometric curves for all conditions using an adaptive staircase procedure. All subjects perceived the single-frequency gratings as moving slower than their superpositions, with the low-frequency grating perceived as the slowest. However, we found no significant difference in perceived speed between the combined gratings in the “peaks-add” and “peaks-subtract” configurations, even though the effective contrast of the two configurations differs by 30%.
The measured perceived speeds are consistent with the predictions of a Bayesian observer model that optimally integrates the sensory signals of two independent spatiotemporal frequency channels, each responding to one of the two gratings, in combination with a prior for slow speeds. Fits of the observer model to individual subjects' data account well for the full set of psychometric functions. The estimated prior and likelihood parameters are consistent across subjects and comparable to values suggested in previous studies. Our results may lead to improved models of coherent motion perception for more complex stimuli.
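The qualitative prediction of such an observer can be illustrated with a minimal sketch. It is not the fitted model: it assumes Gaussian likelihoods for each channel and a zero-mean Gaussian prior on speed, and all noise widths, the prior width, and the function name are hypothetical values chosen for illustration only. Under these assumptions, the posterior mean is a precision-weighted average that is pulled toward zero, and adding a second independent channel narrows the combined likelihood and so weakens that pull.

```python
def posterior_mean_speed(measurements, sigmas, prior_sigma):
    """Posterior mean speed for independent Gaussian channel likelihoods
    combined with a zero-mean Gaussian prior (a simplifying assumption)."""
    precisions = [1.0 / s ** 2 for s in sigmas]
    prior_precision = 1.0 / prior_sigma ** 2
    # Precision-weighted sum of channel measurements; the prior contributes
    # precision but no weighted evidence because its mean is zero.
    weighted_sum = sum(p * m for p, m in zip(precisions, measurements))
    return weighted_sum / (sum(precisions) + prior_precision)


# Hypothetical parameters (not estimated from any data).
true_speed = 2.0   # deg/s, physical stimulus speed
sigma_low = 1.0    # likelihood width of the low-frequency channel
sigma_high = 0.7   # likelihood width of the high-frequency channel
prior_sigma = 1.5  # width of the slow-speed prior

# Noise-free measurements, so each estimate reflects only the prior's bias.
low_only = posterior_mean_speed([true_speed], [sigma_low], prior_sigma)
high_only = posterior_mean_speed([true_speed], [sigma_high], prior_sigma)
combined = posterior_mean_speed(
    [true_speed, true_speed], [sigma_low, sigma_high], prior_sigma
)

print(f"low-frequency grating only:  {low_only:.3f} deg/s")
print(f"high-frequency grating only: {high_only:.3f} deg/s")
print(f"superposition:               {combined:.3f} deg/s")
```

With these illustrative values, the estimates are ordered low-frequency < high-frequency < superposition < physical speed. The estimate for the superposition does not depend on the relative phase of the two gratings, because each channel is assumed to respond only to its own component.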