We use the same model as that described in Körding et al. (2007). Figure 2 shows the statistical structure of the Bayesian observer model. The most important feature of this model is that it does not assume integration a priori. Instead, it assumes that the sensory signals $x_V$ and $x_A$ are caused by either a single source $s$ (Figure 2, left) or by two separate sources, $s_A$ and $s_V$ (Figure 2, right). $x_V$ and $x_A$ represent the visual and auditory signals, respectively, and are assumed to be conditionally independent, based on the observation that the auditory and visual signals are processed in separate pathways and are likely corrupted by independent noise.
Presented with the signals $x_V$ and $x_A$, the Bayesian observer therefore has to estimate whether the two signals originate from a common cause ($C = 1$) or from two separate causes ($C = 2$). How likely each scenario is depends on how similar the auditory and visual sensations ($x_V$ and $x_A$) are. According to Bayes' rule, the probability of there being a single cause is:

$$p(C = 1 \mid x_V, x_A) = \frac{p(x_V, x_A \mid C = 1)\, p_C}{p(x_V, x_A \mid C = 1)\, p_C + p(x_V, x_A \mid C = 2)\,(1 - p_C)}$$
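where $p_C$ denotes the prior probability of a single cause in the environment and $p(x_V, x_A \mid C = 1)$ and $p(x_V, x_A \mid C = 2)$ can be found by marginalizing over $s_A$ and $s_V$ (see Körding et al., 2007). Given this knowledge, the optimal solution for the location that minimizes the mean expected squared error is:

$$\hat{s}_{V \text{ or } A} = p(C = 1 \mid x_V, x_A)\, \hat{s}_{C=1} + \bigl(1 - p(C = 1 \mid x_V, x_A)\bigr)\, \hat{s}_{V \text{ or } A,\, C=2}$$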
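where $\hat{s}_{V \text{ or } A}$ is the visual or auditory response, $\hat{s}_{C=1}$ is the optimal estimate if we were certain that there is a single cause, and $\hat{s}_{V,C=2}$ and $\hat{s}_{A,C=2}$ are the visual and auditory unimodal estimates, respectively, if we were certain that the two stimuli are independent (two causes). We assume that the unimodal likelihoods, $p(x_V \mid s_V)$ and $p(x_A \mid s_A)$, as well as the prior probability distribution over locations (assuming $p(s) = p(s_V) = p(s_A)$), are normally distributed with means and variances ($\mu_V$, $\sigma_V^2$), ($\mu_A$, $\sigma_A^2$), and ($\mu_P$, $\sigma_P^2$), respectively. Thus:

$$\hat{s}_{C=1} = \frac{x_V/\sigma_V^2 + x_A/\sigma_A^2 + \mu_P/\sigma_P^2}{1/\sigma_V^2 + 1/\sigma_A^2 + 1/\sigma_P^2}$$

and

$$\hat{s}_{V,C=2} = \frac{x_V/\sigma_V^2 + \mu_P/\sigma_P^2}{1/\sigma_V^2 + 1/\sigma_P^2}, \qquad \hat{s}_{A,C=2} = \frac{x_A/\sigma_A^2 + \mu_P/\sigma_P^2}{1/\sigma_A^2 + 1/\sigma_P^2}.$$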
$C$ is binomially distributed with $P(C = 1) = p_C$. We assume that the means of the likelihoods are at the veridical locations and that the mean of the prior distribution over locations is at the fixation point, 0 deg. In order to relate the theoretical posterior to the subjects' responses, we assume that subjects try to minimize their mean squared deviation and therefore report the mean of their posterior. The four free parameters ($\sigma_A$, $\sigma_V$, $\sigma_P$, $p_C$) were fitted to the participants' responses using 10,000 trials of Monte Carlo simulation and MATLAB's fminsearch function (Mathworks, 2006), maximizing the likelihood of the parameters of the model.
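The inference step described above can be sketched in Python for a single pair of measurements. This is a minimal illustration, not the fitting code used in the study: the function names and the parameter values in the usage example are hypothetical, and the closed-form marginal likelihoods are the standard results of marginalizing Gaussian likelihoods over a Gaussian prior (compare Körding et al., 2007).

```python
import math


def gauss(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior_common(x_v, x_a, sig_v, sig_a, sig_p, p_c, mu_p=0.0):
    """p(C = 1 | x_V, x_A) via Bayes' rule (Gaussian assumptions)."""
    var_v, var_a, var_p = sig_v ** 2, sig_a ** 2, sig_p ** 2
    # Determinant of the covariance of (x_V, x_A) when both share one source.
    det = var_v * var_a + var_v * var_p + var_a * var_p
    quad = ((x_v - x_a) ** 2 * var_p
            + (x_v - mu_p) ** 2 * var_a
            + (x_a - mu_p) ** 2 * var_v) / det
    like_c1 = math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))
    # With two causes the measurements are independent given the prior.
    like_c2 = gauss(x_v, mu_p, var_v + var_p) * gauss(x_a, mu_p, var_a + var_p)
    return like_c1 * p_c / (like_c1 * p_c + like_c2 * (1.0 - p_c))


def estimates(x_v, x_a, sig_v, sig_a, sig_p, p_c, mu_p=0.0):
    """Return (visual estimate, auditory estimate, p(C = 1 | x_V, x_A))."""
    w_v, w_a, w_p = 1.0 / sig_v ** 2, 1.0 / sig_a ** 2, 1.0 / sig_p ** 2
    # Precision-weighted means for the common-cause and separate-cause cases.
    s_c1 = (w_v * x_v + w_a * x_a + w_p * mu_p) / (w_v + w_a + w_p)
    s_v_c2 = (w_v * x_v + w_p * mu_p) / (w_v + w_p)
    s_a_c2 = (w_a * x_a + w_p * mu_p) / (w_a + w_p)
    post = posterior_common(x_v, x_a, sig_v, sig_a, sig_p, p_c, mu_p)
    # Weight the two structure-specific estimates by the posterior over C.
    s_v = post * s_c1 + (1.0 - post) * s_v_c2
    s_a = post * s_c1 + (1.0 - post) * s_a_c2
    return s_v, s_a, post


if __name__ == "__main__":
    # Illustrative (not fitted) parameter values, in degrees.
    print(estimates(x_v=-5.0, x_a=5.0, sig_v=2.0, sig_a=8.0, sig_p=15.0, p_c=0.5))
```

As the discrepancy between the two measurements grows, the posterior probability of a common cause falls and each estimate moves toward its unimodal value.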
The posterior can be rewritten in a more familiar form (Shams et al., 2005):

$$p(s_A, s_V \mid x_A, x_V) = \frac{p(x_A \mid s_A)\, p(x_V \mid s_V)\, p(s_A, s_V)}{p(x_A, x_V)}$$

where

$$p(s_A, s_V) = p_C\, p(s_A)\, \delta(s_A - s_V) + (1 - p_C)\, p(s_A)\, p(s_V).$$

This is a mixture model (see Körding et al., 2007, for more details), mixing the priors from the two separate causal structures in Figure 2, and is therefore very similar to models developed for mixture problems in the visual system (Knill, 2003, 2007; Landy, Maloney, Johnston, & Young, 1995).
As in Körding et al. (2007) and Stocker and Simoncelli (2006), we model the trial-to-trial variability in observer responses as opposed to average behavior. We assume that the variability in response for the same stimulus condition from trial to trial is primarily due to noise in measurement (neuronal firing). Because of this variability in measurement, the mean of the likelihood function fluctuates from trial to trial, but its variance is constant (here it is assumed that the nervous system has an accurate estimate of this variability). The average of the means of the likelihood distribution is assumed to be at the veridical position, i.e., there is no bias in measurement. The variability in the likelihood function leads to variability in the posterior and hence in the estimate (which is the mean of the posterior) from trial to trial.
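The trial-to-trial variability described above can be illustrated with a small Monte Carlo sketch. On each simulated trial the measurements are drawn around the veridical stimulus positions with fixed noise variances, and the model's posterior-mean estimates are computed from those noisy measurements. The function names, the seed, and the parameter values are hypothetical choices for illustration, and the Gaussian closed forms are assumed as in Körding et al. (2007); this is not the authors' MATLAB implementation.

```python
import math
import random


def model_estimates(x_v, x_a, sig_v, sig_a, sig_p, p_c, mu_p=0.0):
    """Posterior-mean visual and auditory estimates for one pair of measurements."""
    var_v, var_a, var_p = sig_v ** 2, sig_a ** 2, sig_p ** 2
    det = var_v * var_a + var_v * var_p + var_a * var_p
    quad = ((x_v - x_a) ** 2 * var_p + (x_v - mu_p) ** 2 * var_a
            + (x_a - mu_p) ** 2 * var_v) / det
    like_c1 = math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))
    like_c2 = (math.exp(-0.5 * ((x_v - mu_p) ** 2 / (var_v + var_p)
                                + (x_a - mu_p) ** 2 / (var_a + var_p)))
               / (2.0 * math.pi * math.sqrt((var_v + var_p) * (var_a + var_p))))
    post_c1 = like_c1 * p_c / (like_c1 * p_c + like_c2 * (1.0 - p_c))
    w_v, w_a, w_p = 1.0 / var_v, 1.0 / var_a, 1.0 / var_p
    s_c1 = (w_v * x_v + w_a * x_a + w_p * mu_p) / (w_v + w_a + w_p)
    s_v_c2 = (w_v * x_v + w_p * mu_p) / (w_v + w_p)
    s_a_c2 = (w_a * x_a + w_p * mu_p) / (w_a + w_p)
    return (post_c1 * s_c1 + (1.0 - post_c1) * s_v_c2,
            post_c1 * s_c1 + (1.0 - post_c1) * s_a_c2)


def simulate_responses(s_v, s_a, sig_v, sig_a, sig_p, p_c,
                       n_trials=10000, mu_p=0.0, seed=0):
    """Simulate visual and auditory responses for one stimulus condition."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    vis, aud = [], []
    for _ in range(n_trials):
        # Unbiased measurements: likelihood means scatter around the true positions.
        x_v = rng.gauss(s_v, sig_v)
        x_a = rng.gauss(s_a, sig_a)
        est_v, est_a = model_estimates(x_v, x_a, sig_v, sig_a, sig_p, p_c, mu_p)
        vis.append(est_v)
        aud.append(est_a)
    return vis, aud


if __name__ == "__main__":
    # Illustrative condition: visual stimulus at -10 deg, auditory at +10 deg.
    vis, aud = simulate_responses(-10.0, 10.0, 2.0, 8.0, 15.0, 0.5)
    print(sum(vis) / len(vis), sum(aud) / len(aud))
```

The resulting lists approximate the predicted response distributions for one stimulus condition. A fitting procedure would compare such distributions with the observed responses, which this sketch does not attempt.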