Abstract
In a multisensory task, the extent to which unisensory signals are combined depends not only on the saliency of each cue but also on the likelihood that these signals are caused by a common sensory event. This causal inference problem can be solved if the signals share a common feature. We therefore investigated whether performance in an audio-visual redundant signal paradigm would be influenced by a temporal binding feature. In this paradigm, participants were simultaneously presented with a random dot motion stimulus and a tone cloud. The random dot motion consisted of 150 white dots displayed on a dark background within a 5.0 dva circular aperture and randomly repositioned every 83 ms. The tone cloud was a train of chords, each composed of 5 simultaneous short pure tones (83 ms) drawn uniformly from a 3-octave range (330 to 2640 Hz) with a resolution of 12 semitones per octave. Participants were asked to detect rising or falling patterns appearing in the auditory or visual modality (unisensory changes) or in both modalities at once (redundant bimodal changes). Critically, dot contrast and tone level were modulated by a temporal envelope that was irrelevant to the detection task. These envelopes could be identical or different across the two modalities, allowing us to manipulate the extent to which the unisensory cues reflected a common multisensory object. Compared to unisensory changes, bimodal changes elicited faster responses and violated the theoretical Miller's bound, indicating a redundancy gain larger than a race between independent unisensory decisions can produce. Interestingly, response times to unisensory changes were more precise (less variable) when the two modalities shared the same temporal envelope than when their envelopes differed. This suggests that, when the auditory and visual modalities are temporally bound, a multisensory object is formed and all features of this object are enhanced. Extracting the temporal coherence of the modalities might therefore help solve the correspondence problem and bind information across the senses.
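As an illustration of the stimulus construction described above, the following is a minimal sketch of tone-cloud generation. The sample rate, sinusoidal waveform, and function name are assumptions for illustration; only the chord duration, number of tones, frequency range, and semitone resolution come from the abstract.

```python
import numpy as np

SR = 44100          # sample rate in Hz (assumed; not stated in the abstract)
CHORD_DUR = 0.083   # one 83 ms chord
N_TONES = 5         # simultaneous pure tones per chord
F0, N_OCT, STEPS = 330.0, 3, 12   # 330-2640 Hz, 12 semitones per octave

# Candidate frequencies: 3 octaves above 330 Hz in semitone steps (37 values).
freqs = F0 * 2.0 ** (np.arange(N_OCT * STEPS + 1) / STEPS)

def tone_cloud(n_chords, rng=None):
    """Concatenate chords, each of N_TONES pure tones drawn uniformly from freqs."""
    rng = rng or np.random.default_rng()
    t = np.arange(int(SR * CHORD_DUR)) / SR
    chords = []
    for _ in range(n_chords):
        f = rng.choice(freqs, size=N_TONES, replace=False)
        chord = np.sin(2.0 * np.pi * f[:, None] * t).sum(axis=0) / N_TONES
        chords.append(chord)
    return np.concatenate(chords)

cloud = tone_cloud(24)  # roughly 2 s of tone cloud
```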
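The theoretical Miller's bound referenced above is the race model inequality (Miller, 1982): if redundant-target responses arise from a race between independent unisensory channels, then for every time t the cumulative distribution of bimodal response times cannot exceed the sum of the unisensory cumulative distributions,

\[
F_{AV}(t) \;\le\; F_A(t) + F_V(t) \quad \text{for all } t,
\]

where \(F_X(t) = P(\mathrm{RT} \le t \mid X)\). Observing \(F_{AV}(t) > F_A(t) + F_V(t)\) at some t rules out all race models and is commonly taken as evidence of coactivation.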
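A minimal sketch of how such a violation can be checked from empirical response times follows; the gamma-distributed samples are illustrative placeholders, not data from the study.

```python
import numpy as np

def ecdf(rts, t_grid):
    """Empirical cumulative distribution of response times on a time grid."""
    rts = np.sort(np.asarray(rts))
    return np.searchsorted(rts, t_grid, side="right") / rts.size

# Illustrative response-time samples in seconds.
rng = np.random.default_rng(0)
rt_a = rng.gamma(9.0, 0.05, size=200)    # auditory-only changes
rt_v = rng.gamma(10.0, 0.05, size=200)   # visual-only changes
rt_av = rng.gamma(7.0, 0.05, size=200)   # redundant bimodal changes

t = np.linspace(0.1, 1.0, 91)
bound = np.minimum(ecdf(rt_a, t) + ecdf(rt_v, t), 1.0)  # Miller's bound
violation = ecdf(rt_av, t) - bound
print("max violation:", violation.max())  # > 0 is inconsistent with a race model
```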