Abstract
Research in ensemble perception has documented that people can calculate the mean of a feature distribution with relatively high precision. How is that “mean” computed? Whereas some models assume that the visual system simply averages a subsample of items, other models based on population coding (Haberman & Whitney, 2012; Hochstein, VSS 2018) suggest that the “mean” is represented as the peak or vector sum of a neural population response to all items. Recently, Utochkin (ECVP 2019) proposed a hypothetical model of ensemble coding in which neurons with large receptive fields pool local feature signals from lower-level populations with smaller receptive fields. The activation profile of this pooled response, with its “central tendency” peak, follows directly from the tuning curves of the pooling neurons, which reflect the distribution of synaptic weights assigned to the local signals. Our computational implementation of this pooling+population coding model predicted that, in ensembles with skewed feature distributions, the peak of the population response shifts away from the physical mean of the distribution toward its mode, and that the larger the skew, the larger the shift. To test this prediction, we asked participants to report, by adjustment, the mean orientation of 25 triangles. We varied the skew of the orientation distribution and measured the systematic deviation of the mean orientation estimates from the physical mean. The estimates followed the prediction. Importantly, the amount of bias away from the physical mean showed an excellent fit (R^2 = 0.99) to a model based on realistic tuning properties of broadly tuned, orientation-selective V4 neurons (McAdams & Maunsell, 1999). We conclude that pooling+population coding in higher-level visual areas is a plausible neural mechanism of ensemble averaging and that V4 neurons may be involved in ensemble encoding of orientation.
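
The predicted shift of the population peak can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The Gaussian tuning curves, the 35° tuning width, the treatment of orientation as a linear rather than circular variable, and the stimulus family generated by the hypothetical helper skewed_orientations (with skew parameter k) are all assumptions made for illustration only. Each pooling neuron sums the responses evoked by all 25 items, and the "mean" is read out as the preferred orientation of the most active neuron.

```python
import numpy as np

def skewed_orientations(k, n_items=25, half_range=30.0):
    """Hypothetical stimulus set: orientation offsets (deg) centered on 0.

    k = 0 gives a symmetric set; k > 0 stretches the positive side,
    producing a long tail toward positive offsets (assumed skew family).
    """
    u = np.linspace(-1.0, 1.0, n_items)
    offsets = half_range * (u + k * u**2)
    return offsets - offsets.mean()  # physical mean is 0 deg by construction

def population_peak(orientations, tuning_sd=35.0, n_neurons=18001):
    """Preferred orientation (deg) of the most active pooling neuron.

    Each neuron has a Gaussian tuning curve (assumed shape and width) and
    sums the responses evoked by every item in the ensemble.
    """
    preferred = np.linspace(-90.0, 90.0, n_neurons)  # 0.01 deg spacing
    diff = preferred[:, None] - orientations[None, :]
    pooled = np.exp(-diff**2 / (2.0 * tuning_sd**2)).sum(axis=1)
    return preferred[np.argmax(pooled)]

# Shift of the population peak relative to the physical mean (0 deg).
shifts = {k: population_peak(skewed_orientations(k)) for k in (0.0, 0.25, 0.5)}
for k, shift in shifts.items():
    print(f"skew parameter k = {k:.2f}: peak shift = {shift:+.2f} deg")
```

Under these assumptions the peak stays at the physical mean for the symmetric set and moves away from the long tail, toward the dense side of the distribution, as k grows. With much narrower tuning relative to the spread of the items, the peak instead tracks the densest region of the distribution, so the size of the shift depends strongly on the assumed tuning width.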