Abstract
What are the limits of sensory fusion in cue-conflict stimuli? We recorded perceptual reports and response times (RTs) for discrepant auditory and visual cues. A shape subtending approximately 10 deg was displayed twice on a monitor, each time for 83 ms, with the two presentations separated by 333 ms. The size of the shape changed between the two occurrences so as to simulate a displacement in depth. Simultaneously with the visual displays, auditory white noise was played through headphones with varying loudness, also simulating a distance change (inverse square law). Participants reported whether the target was approaching or receding. We recorded unimodal and bimodal performance. In bimodal trials the two cues were either congruent (simulating the same change in depth) or opposite, creating a strong conflict between the cues. From block to block, observers were instructed to report the direction of either the visual signal or the auditory signal. We found that in visual blocks perceptual reports were similar in the unimodal, congruent, and conflict conditions, as were response times. Thus responses appear to have been mediated by the visual signal alone. In sharp contrast, in auditory blocks perceptual decisions were more precise in the congruent condition than in the unimodal condition. In the conflict condition, however, perceptual decisions were very poor and response times were longer. Therefore observers could disregard the auditory signal when attending to the visual signal, but were unable to suppress the visual signal when attending to the auditory signal. In a separate experiment, observers were unaware of the cue conflict and fused the cues optimally. A mixture model (Knill, 2003; Körding et al., 2007), in which the probability of fusing the cues decreases with increasing conflict, captured the pattern of results only if different fusion functions were used in the different tasks, underscoring the strong contribution of task to sensory fusion.
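For illustration, the inverse-square-law loudness manipulation mentioned above can be sketched as follows; the specific distances, function name, and printed values are hypothetical examples, not stimulus parameters reported in the abstract.

```python
import math

def level_change_db(d_start, d_end):
    """Change in sound level (dB) when a source moves from d_start to d_end.

    Under the inverse square law, intensity is proportional to 1/d**2,
    so the level change is 10*log10(I_end/I_start) = 20*log10(d_start/d_end).
    """
    return 20.0 * math.log10(d_start / d_end)

# Example: a source that halves its distance (approaching) gets ~6 dB louder,
# while one that doubles its distance (receding) gets ~6 dB quieter.
print(level_change_db(2.0, 1.0))  # ~ +6.02 dB (approaching)
print(level_change_db(1.0, 2.0))  # ~ -6.02 dB (receding)
```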
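A minimal sketch of the kind of mixture (causal-inference) model cited above (Knill, 2003; Körding et al., 2007), in which the probability of fusing the cues falls off with the size of the cue conflict, is given below. The Gaussian fusion function, its width parameter sigma_fuse, and the reliability values are illustrative assumptions, not the fitted model from the study; the task dependence reported in the abstract would correspond to using a different fusion function (e.g., a different sigma_fuse) in the visual and auditory tasks.

```python
import numpy as np

def mixture_estimate(x_v, x_a, sigma_v, sigma_a, sigma_fuse):
    """Auditory-task estimate under a simple mixture model.

    With probability p_fuse the cues are combined by reliability weighting
    (optimal fusion); otherwise only the attended (auditory) cue is used.
    p_fuse is assumed to decrease as a Gaussian function of the conflict
    |x_v - x_a|; other fusion functions are possible.
    """
    conflict = abs(x_v - x_a)
    p_fuse = np.exp(-conflict**2 / (2 * sigma_fuse**2))

    # Reliability-weighted (optimal) fused estimate.
    w_v = sigma_a**2 / (sigma_v**2 + sigma_a**2)
    fused = w_v * x_v + (1 - w_v) * x_a

    # Mixture of fused and segregated (auditory-only) estimates.
    return p_fuse * fused + (1 - p_fuse) * x_a

# Small conflict: the fused estimate dominates; large conflict: the estimate
# stays close to the attended (auditory) cue.
print(mixture_estimate(x_v=1.0, x_a=0.8, sigma_v=0.5, sigma_a=1.0, sigma_fuse=1.0))
print(mixture_estimate(x_v=1.0, x_a=-1.0, sigma_v=0.5, sigma_a=1.0, sigma_fuse=1.0))
```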
Meeting abstract presented at VSS 2017