Abstract
Ensemble perception is a perceptual integration process that computes the average feature of an array of visual stimuli. Recent evidence shows that observers weight the elements in a display unequally when computing the mean: higher weights are assigned to elements whose features lie close to the mean (inlying elements) and lower weights to those lying farther from it (outlying elements). This non-uniform weighting, termed robust averaging, has been taken as evidence against optimal Bayesian behavior in perception. Here we show that robust averaging can be predicted by a Bayesian observer model constrained by efficient coding, in which sensory representations are optimized for the stimulus statistics learned over the course of the experiment. To test our model, we fitted the data of Li et al. (2017), in which subjects discriminated the average feature (orientation) of eight elements displayed in a circle relative to a reference element. Element features were sampled from Gaussian distributions with varying means and variances. Our model captured the key aspects of the reported data. Specifically, it predicted 1) higher weights for inlying than for outlying elements, 2) overall higher weights when the reference element was fixed rather than varied within blocks, and 3) higher discrimination accuracy when the Gaussian distribution had a large generative mean (relative to the reference) and a small variance. In addition, our model replicated the signature of robust averaging reported by de Gardelle & Summerfield (2011), who used different stimulus features (color and shape). Our modeling results suggest that robust averaging arises from the inhomogeneous encoding precision of inlying and outlying elements. Furthermore, they imply that efficient sensory representations of visual stimuli can be established on a short timescale, by learning the stimulus statistics over the course of a psychophysical experiment.