Abstract
For locomotion through the visual world, a core issue is how its 3D structure is encoded through the neural computation of multiple depth cues and their integration into a unitary representation of the changing scene. The overall percept must be derived from the combination of all available depth cues, but a simple linear summation rule across the array of cues would massively overestimate perceived depth whenever each cue alone provides a close-to-veridical depth estimate. On the other hand, a Bayesian averaging, or ‘modified weak fusion’, model of depth cue combination does not account for the observed enhancement of perceived depth from weak depth cues. To assess performance with multisensory cues, the case of perceived heading from motion in depth conveyed by visual and vestibular cues is considered, based on data kindly provided from a published behavioral study with monkeys (Dokka, DeAngelis & Angelaki, 2015, J Neurosci). In that study, perceived heading was biased by an auxiliary cue, the lateral motion of a visually defined sphere, and the monkeys received a juice reward for correctly reporting whether the physical heading was to the left or right. Theoretical distributions for the Bayesian analyses were derived from the individual trial data. The individual heading estimates were substantially biased by the decoy sphere motion for each of the visual and vestibular cues alone, but much less so when the cues were combined. An Obligate Bayesian rule for cue combination, calculated for these data, could not account for the observed reduction in the combined heading bias. The data are, however, consistent with a Selective Bayesian rule, in which the monkeys are assumed to be able to ignore the heading information from the visual modality when it is advantageous to do so.
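As a point of reference for the combination rules contrasted above, the standard Bayesian averaging (‘modified weak fusion’) scheme can be sketched in its usual Gaussian form; the notation here ($\hat{s}_i$, $\sigma_i$, $w_i$) is illustrative rather than drawn from the study itself, and independent Gaussian cue likelihoods are assumed. If each cue $i$ yields an estimate $\hat{s}_i$ of the scene parameter (depth, or heading) with variance $\sigma_i^2$, linear summation would predict $\hat{s}_{\mathrm{sum}} = \sum_i \hat{s}_i$, which overestimates the quantity whenever each $\hat{s}_i$ is individually near-veridical, whereas reliability-weighted averaging predicts

\[
\hat{s}_{\mathrm{comb}} \;=\; \sum_{i} w_i\,\hat{s}_i,
\qquad
w_i \;=\; \frac{1/\sigma_i^{2}}{\sum_{j} 1/\sigma_j^{2}},
\qquad
\sigma_{\mathrm{comb}}^{2} \;=\; \Bigl(\sum_{i} \sigma_i^{-2}\Bigr)^{-1}.
\]

In this notation, an Obligate Bayesian rule would apply the weighted average unconditionally across all available cues, while a Selective Bayesian rule additionally permits a cue's weight to be set to zero (e.g., $w_{\mathrm{vis}} = 0$, leaving $\hat{s}_{\mathrm{comb}} = \hat{s}_{\mathrm{vest}}$) when discarding that cue's heading information is advantageous.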