Abstract
Perception of objects depends on neurons in the ventral visual stream that respond with high selectivity to complex patterns. These neurons tend to have large receptive fields, which support position-invariant object recognition but entail poor spatial resolution. Specifically, when multiple objects fall within the receptive field of a ventral visual neuron, unless selective attention is deployed to one specific object, the neuron's response tends to be the average of its responses to the individual objects. With complex stimuli such as facial expressions, this neural averaging predicts that multiple expressions within a receptive field would be perceptually averaged. Such perceptual averaging should occur across large spatial separations, but only when the faces are presented within the same visual hemifield, because face-tuned neurons have large receptive fields that are mostly confined to the contralateral visual hemifield. On each trial of our experiments, two faces (separated by ∼7°) were briefly presented (100 ms) either within the same visual hemifield or across hemifields. Observers rated the valence of one face from the pair, indicated by a post-cue. Consistent with the neural-averaging prediction, when an angry face and a valence-neutral face with a surprise expression were presented within the same visual hemifield, observers perceived the valence-neutral face as more negative and the angry face as less negative than when the two faces were presented across hemifields. We thus demonstrated perceptual consequences of neural averaging with high-level visual features. When viewing time is limited and spatial selective attention is not engaged in advance, a strong feature is softened and its influence spreads to weaker neighbors.
This work was supported by NIH grant EY14110 and NSF grant BCS0643191.