**A large body of research has established that, under relatively simple task conditions, human observers integrate uncertain sensory information with learned prior knowledge in an approximately Bayes-optimal manner. However, in many natural tasks, observers must perform this sensory-plus-prior integration when the underlying generative model of the environment consists of multiple causes. Here we ask if the Bayes-optimal integration seen with simple tasks also applies to such natural tasks when the generative model is more complex, or whether observers rely instead on a less efficient set of heuristics that approximate ideal performance. Participants localized a “hidden” target whose position on a touch screen was sampled from a location-contingent bimodal generative model with different variances around each mode. Over repeated exposure to this task, participants learned the a priori locations of the target (i.e., the bimodal generative model), and integrated this learned knowledge with uncertain sensory information on a trial-by-trial basis in a manner consistent with the predictions of Bayes-optimal behavior. In particular, participants rapidly learned the locations of the two modes of the generative model, but the relative variances of the modes were learned much more slowly. Taken together, our results suggest that human performance in a more complex localization task, which requires the integration of sensory information with learned knowledge of a bimodal generative model, is consistent with the predictions of Bayes-optimal behavior, but involves a much longer time-course than in simpler tasks.**

**Figure 1**


*SD* of 10 pixels), medium variance (*SD* of 60 pixels), or high variance (*SD* of 100 pixels; see inset of Figure 1A for an illustration of the three levels of variance). After participants provided their response by touching a location on the display, feedback on their accuracy on that trial was provided by displaying a new dot at the touched location and a second new dot at the true target location. In addition, participants received feedback in the form of numerical points, with the magnitude of the points varying based on their accuracy.
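As a rough illustration of the kind of trial described here, the following sketch draws a hidden target from one mode of a two-mode prior and then draws a cloud of dots around it. Only the three dot-cloud standard deviations (10, 60, and 100 pixels) come from the text. The prior means and standard deviations, the number of dots, the one-dimensional layout, and the assumption that the cloud is centered on the target are all hypothetical choices made for the example.

```python
import random
from statistics import fmean

# Dot-cloud standard deviations in pixels, as stated in the text.
LIKELIHOOD_SD = {"low": 10.0, "medium": 60.0, "high": 100.0}

# Hypothetical prior parameters (not from the article): one narrow mode
# and one broad mode along a single screen dimension.
PRIOR_MODES = {
    "narrow": {"mean": 300.0, "sd": 20.0},
    "broad": {"mean": 700.0, "sd": 80.0},
}
N_DOTS = 8  # hypothetical number of dots per cloud


def sample_trial(rng, mode, likelihood_level):
    """Draw one trial: a hidden target, a dot cloud, and the cloud centroid."""
    prior = PRIOR_MODES[mode]
    target = rng.gauss(prior["mean"], prior["sd"])
    # Assumption: the dot cloud is centered on the hidden target.
    dot_sd = LIKELIHOOD_SD[likelihood_level]
    dots = [rng.gauss(target, dot_sd) for _ in range(N_DOTS)]
    return {"target": target, "dots": dots, "centroid": fmean(dots)}


rng = random.Random(0)  # seeded so the example is repeatable
trial = sample_trial(rng, "narrow", "high")
```

The centroid of the dots plays the role of the sensory estimate, and the mode that generated the target plays the role of the prior.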

$\mu_l$ and $\sigma_l^2$, and the mean and the variance of the underlying target distribution for that trial are given by $\mu_p$ and $\sigma_p^2$, then the target location $\hat{t}$ predicted by this model would be (Bernardo & Smith, 1994; Cox, 1946; Jacobs, 1999; Yuille & Bülthoff, 1996):

$$\hat{t} = w_l \mu_l + (1 - w_l)\mu_p$$

where $w_l$, the weight assigned by the observer to the sensory information, should be:

$$w_l = \frac{1/\sigma_l^2}{1/\sigma_l^2 + 1/\sigma_p^2}$$
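The weighted combination described here can be sketched in a few lines. The sketch assumes the standard inverse-variance (reliability) form of the weight for Gaussian likelihood and prior. The function and argument names are illustrative and not taken from the article.

```python
def optimal_weight(sigma_l, sigma_p):
    """Weight on the sensory centroid, given likelihood and prior SDs.

    Assumes Gaussian likelihood and prior, so that the weight is the
    relative reliability (inverse variance) of the sensory information.
    """
    reliability_l = 1.0 / sigma_l**2
    reliability_p = 1.0 / sigma_p**2
    return reliability_l / (reliability_l + reliability_p)


def optimal_estimate(mu_l, sigma_l, mu_p, sigma_p):
    """Reliability-weighted average of the sensory centroid and prior mean."""
    w_l = optimal_weight(sigma_l, sigma_p)
    return w_l * mu_l + (1.0 - w_l) * mu_p


# Equal uncertainties give equal weights, so the estimate is the midpoint.
midpoint = optimal_estimate(mu_l=100.0, sigma_l=10.0, mu_p=200.0, sigma_p=10.0)
```

Under this form, a noisier dot cloud (larger `sigma_l`) or a narrower prior (smaller `sigma_p`) both pull the estimate toward the prior mean.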

In this study, you will be playing a game on the iPad. The story behind this game is that you are at a fair and there is an invisible bucket that you are trying to locate. Sometimes the bucket is going to be located on the left side of the display and at other times the bucket is going to be located on the right side of the display. Now, given that the bucket is invisible, you can't see where it is. However, on each trial you will see some locations that other people have previously guessed the bucket is located at. These “guesses” will show up as white or green dots on the screen. Now, it is important to note that you don't know which (if any) of the dots actually correspond to the location of the bucket. Indeed, all of the dots could be just random guesses. Your job on each trial is to try to figure out where the bucket is actually located. Once you decide on a location, you can just touch it, at which point you will see two more dots: a red dot, which shows you the true location of the bucket on that trial, and a blue dot, which shows you the location that you touched. If the blue dot is right on the red dot, then you correctly guessed the location of the bucket and you will get 20 points. If you don't exactly guess the location of the bucket but you still get close, then you will get 5 points. When you see this, it means that you need to try just a little harder to get the right location; you are close. Finally, if your guess is very far away, you will get no points.

$w_l$) assigned to the centroid of the cluster of dots (the likelihood), with the weight assigned to the mean of the underlying target distribution for that trial (the prior) being defined as $(1 - w_l)$. Thus, on each trial, given the centroid of the sensory information ($\mu_l$), the mean of the underlying target distribution ($\mu_p$), and participants' estimate for the target location ($\hat{t}$), the weight assigned by participants to the centroid of the sensory information ($w_l$) was estimated using:

$$\hat{t} = w_l \mu_l + (1 - w_l)\mu_p$$
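One way to recover such a weight from behavior is sketched below, under the assumption that each response is a weighted average of the dot-cloud centroid and the prior mean. Rearranging gives (response − prior mean) = weight × (centroid − prior mean), so the weight can be fit as a least-squares slope through the origin across trials. This fitting choice and all names are illustrative, and the article's exact procedure may differ.

```python
def estimate_weight(centroids, prior_means, responses):
    """Least-squares weight on the centroid across a set of trials.

    Fits w in: (response - prior_mean) = w * (centroid - prior_mean),
    a regression through the origin (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for mu_l, mu_p, t_hat in zip(centroids, prior_means, responses):
        x = mu_l - mu_p   # how far the centroid sits from the prior mean
        y = t_hat - mu_p  # how far the response sits from the prior mean
        numerator += x * y
        denominator += x * x
    if denominator == 0.0:
        raise ValueError("centroids never differ from the prior mean")
    return numerator / denominator


# Synthetic check: responses built with a weight of 0.3 on the centroid.
centroids = [120.0, 80.0, 150.0]
prior_means = [100.0, 100.0, 100.0]
responses = [0.3 * c + 0.7 * p for c, p in zip(centroids, prior_means)]
w_hat = estimate_weight(centroids, prior_means, responses)
```

A fitted weight near 1 indicates reliance on the dots alone, while a weight near 0 indicates reliance on the learned prior mean.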

*p*s > 0.05).

*F*(2, 14) = 18.89, *p* < 0.001.

**Figure 2**


*F*(3, 21) = 4.00, *p* = 0.02, a main effect of likelihood variance, *F*(2, 14) = 21.17, *p* < 0.0001, and an interaction between the two factors, *F*(6, 42) = 5.22, *p* < 0.001. Similarly, in a 4 (temporal bin) × 3 (likelihood variance) repeated measures ANOVA carried out over participants' weights in the narrow prior condition, there was a main effect of temporal bin, *F*(3, 21) = 9.06, *p* < 0.001, a main effect of likelihood variance, *F*(2, 14) = 12.24, *p* < 0.001, and an interaction between the two factors, *F*(6, 42) = 5.47, *p* < 0.001. These findings are inconsistent with Model 1, which predicts a weight of 1 to the sensory information across all the likelihood and prior conditions and no change in participants' weights as a function of exposure to the task. Furthermore, in contrast to the predictions of Model 2, we found that the drop in the weight assigned by participants to the likelihood as a function of exposure to the task, particularly in the high likelihood variance condition, was greater for trials in which the target was drawn from the narrow prior than when the target was drawn from the broad prior. In a 4 (temporal bin) × 2 (prior variance) repeated measures ANOVA carried out over participants' weights in the high likelihood variance condition, we saw a main effect of temporal bin, *F*(3, 21) = 8.29, *p* < 0.001, and an interaction between temporal bin and prior variance, *F*(3, 21) = 3.07, *p* = 0.05. This finding suggests that participants are sensitive to, and learn, both the means and the relative variances of the underlying prior distributions. Taken together, this pattern of results provides clear evidence in support of Model 3: the hypothesis that human observers learn the complex generative model and use this learned knowledge in a manner that is consistent with the predictions of Bayes-optimal behavior.
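The contrast between the three models can be made concrete with a small sketch of the weight each one would assign to the sensory information. Model 1 follows the text directly (a weight of 1 everywhere). For Model 2, the sketch assumes that an observer who has learned only the prior means behaves as if both modes shared one prior standard deviation. That pooling, the numeric values, and all names are hypothetical simplifications rather than the article's implementation.

```python
def bayes_weight(sigma_l, sigma_p):
    """Inverse-variance weight on the sensory information (Gaussian case)."""
    return sigma_p**2 / (sigma_l**2 + sigma_p**2)


def predicted_weight(model, sigma_l, sigma_p_true, sigma_p_shared):
    """Weight on the sensory information predicted by each model."""
    if model == 1:
        # Model 1: the prior is ignored, so the weight is always 1.
        return 1.0
    if model == 2:
        # Model 2 (assumption): means are learned but both modes are
        # treated as having the same, shared prior SD.
        return bayes_weight(sigma_l, sigma_p_shared)
    if model == 3:
        # Model 3: the true SD of the relevant mode is learned and used.
        return bayes_weight(sigma_l, sigma_p_true)
    raise ValueError("model must be 1, 2, or 3")


# High likelihood variance (SD of 100 pixels, from the text); the prior
# SDs below are hypothetical values chosen for illustration.
narrow = predicted_weight(3, sigma_l=100.0, sigma_p_true=20.0, sigma_p_shared=50.0)
broad = predicted_weight(3, sigma_l=100.0, sigma_p_true=80.0, sigma_p_shared=50.0)
```

Under these assumptions, only Model 3 predicts a lower weight for the narrow prior than for the broad prior, which is the pattern the interaction between temporal bin and prior variance points to.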

**Figure 3**


There will also be some trials in which you will be the first person to guess the location of the bucket. In these trials, rather than seeing dots on the screen, you will see a briefly flashed white or green rectangle, which will indicate the side of the screen that the invisible bucket is located at—the bucket could be located anywhere inside that rectangle. Again, your job is to try to figure out where the bucket is actually located.

$w_l$) assigned to the centroid of the cluster of dots (the likelihood), with the weights assigned to the mean of the underlying location-contingent target distribution for that trial (the prior) being defined as $(1 - w_l)$. For the trials in which no sensory information was available, we computed participants' mean responses, across all trials in each temporal bin. As in Experiment 1, we again focused on performance in the vertical dimension.

*F*(2, 14) = 45.84, *p* < 0.0001, and an interaction between the two factors, *F*(2, 14) = 3.83, *p* = 0.047. Furthermore, for each prior condition, we again found an interaction between exposure and likelihood variance. Specifically, in a 4 (temporal bin) × 3 (likelihood variance) repeated measures ANOVA carried out over participants' weights in the broad prior condition, there was a main effect of temporal bin, *F*(3, 21) = 7.49, *p* = 0.001, a main effect of likelihood variance, *F*(2, 14) = 34.21, *p* < 0.0001, and an interaction between the two factors, *F*(6, 42) = 2.84, *p* = 0.02. Similarly, in a 4 (temporal bin) × 3 (likelihood variance) repeated measures ANOVA carried out over participants' weights in the narrow prior condition, there was a marginal main effect of temporal bin, *F*(3, 21) = 2.87, *p* = 0.06, a main effect of likelihood variance, *F*(2, 14) = 17.72, *p* < 0.001, and a marginal interaction between the two factors, *F*(6, 42) = 2.30, *p* = 0.052. Finally, participants assigned a smaller weight to the likelihood, particularly in the high likelihood variance condition, when the target was drawn from the narrow prior than when the target was drawn from the broad prior. In a 4 (temporal bin) × 2 (prior variance) repeated measures ANOVA carried out over participants' weights in the high likelihood variance condition, there was a main effect of temporal bin, *F*(3, 21) = 5.55, *p* < 0.001, and a main effect of prior variance, *F*(1, 7) = 7.89, *p* = 0.026. These findings therefore represent a replication of our results from Experiment 1. To further confirm that these results replicate the results from Experiment 1, we carried out a 2 (experiment) × 4 (temporal bin) × 3 (likelihood variance) mixed ANOVA, with experiment as a between-participants factor and temporal bin and likelihood variance as within-participant factors, over participants' weights in both the broad and narrow prior conditions. In each case, we found no interaction between experiment and any other factor (all *p*s > 0.17).

**Figure 4**


$t_{7}$ = 1.73, *p* = 0.13; narrow prior: $t_{7}$ = 1.24, *p* = 0.25). Participants were therefore able to rapidly learn the prior means even when presented with a complex generative model (Figure 5), extending results from previous work examining such behavior in the presence of simpler generative models. It is important to note, however, that in the conditions where sensory information was available, which were randomly interleaved with the prior-only conditions, participants' weights continue to change throughout the experiment (Figure 4). This pattern of results suggests that, while participants learn the prior means very rapidly, it takes much more exposure to learn the relative variances of the two prior distributions. Moreover, the finding that participants learn the true prior means within the first temporal bin, but their weights continue to change throughout the experiment, further contradicts the predictions of Model 2. If participants were only learning and using the prior means, and not the relative variances of the prior distributions (as predicted by Model 2), then we would expect to see no change in participants' weights beyond the first temporal bin, since they show no change in their knowledge of the prior means beyond this bin (as determined by performance in the no-sensory information condition).

**Figure 5**


*internal* estimates of the prior and likelihood distributions, thereby ensuring that the ideal observer has access to the same quality of information as our participants. Since participants were presented with multiple samples from the likelihood distribution (i.e., the cloud of dots) on every trial, the uncertainty implicit in it is computable by the participant on a trial-by-trial basis (see Sato & Kording, 2014, for a scenario in which this was not true). It is therefore reasonable to approximate participants' internal estimate of likelihood uncertainty by the true uncertainty implicit in the distributions used to generate the cloud of dots on each trial (i.e., the standard deviation of the dot-cloud distribution divided by the square root of the number of dots).
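The likelihood uncertainty described here is the standard error of the dot-cloud centroid. A minimal sketch follows. The function name is illustrative, and the dot counts in the example are hypothetical values, not figures from the article.

```python
import math


def centroid_sd(dot_sd, n_dots):
    """Uncertainty of the dot-cloud centroid: dot SD divided by sqrt(n)."""
    if n_dots < 1:
        raise ValueError("n_dots must be at least 1")
    return dot_sd / math.sqrt(n_dots)


# Example with a hypothetical cloud of 9 dots drawn with an SD of 60 pixels:
# the centroid is three times less variable than any single dot.
sd_of_centroid = centroid_sd(dot_sd=60.0, n_dots=9)
```

This value, rather than the raw dot-cloud standard deviation, is the quantity that would serve as the likelihood uncertainty when computing the ideal observer's weight.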

*Nature Neuroscience*, 7, 1057–1058, doi:10.1038/nn1312.

*PLoS ONE*, 6 (5), e19812, doi:10.1371/journal.pone.0019812.

*Bayesian theory*. New York: Wiley.

*PLoS ONE*, 5 (9), e12686, doi:10.1371/journal.pone.0012686.

*American Journal of Physics*, 14 (1), 1–13, doi:10.1119/1.1990764.

*Journal of Neurophysiology*, 98, 3034–3046, doi:10.1152/jn.00858.2007.

*Journal of Experimental Psychology: General*, 129, 220–241, doi:10.1037/0096-3445.129.2.220.

*Vision Research*, 39, 3621–3629.

*Nature Neuroscience*, 13, 1020–1026, doi:10.1038/nn.2590.

*PLoS Biology*, 13 (2), e1002075, doi:10.1371/journal.pbio.1002075.

*Trends in Neurosciences*, 27, 712–719.

*PLoS ONE*, 2 (9), e943, doi:10.1371/journal.pone.0000943.

*Nature*, 427 (6971), 244–247, doi:10.1038/nature02169.

*Proceedings of the National Academy of Sciences, USA*, 110 (11), E1064–E1073, doi:10.1073/pnas.1214869110.

*Journal of Neurophysiology*, 94, 395–399, doi:10.1152/jn.01168.2004.

*PLoS Biology*, 11 (9), e1001662, doi:10.1371/journal.pbio.1001662.

*PLoS Biology*, 13 (2), e1002073, doi:10.1371/journal.pbio.1002073.

*Trends in Cognitive Sciences*, 14, 425–432.

*Nature Neuroscience*, 9, 578–585.

*The Journal of Neuroscience*, 26, 10154–10163, doi:10.1523/JNEUROSCI.2779-06.2006.

*Current Biology*, 22, 1641–1648, doi:10.1016/j.cub.2012.07.010.

*Nature Neuroscience*, 5, 598–604.

*Perception as Bayesian inference* (pp. 123–161). Cambridge, UK: Cambridge University Press.