Abstract
Our effortless interaction with our visual world leaves us with the impression that we have an accurate and complete representation of all that we see. However, examination of the textures in our environment reveals that this is not the case: often we perceive a bed of roses or a crowd of faces, not the individual rose or singular face. This loss of information is by design, as it is more efficient to represent summary statistics about crowds of items than it is to code each individual item. Here, we report that statistical representation occurs at both low and high levels of visual processing. We found that observers precisely extracted the ‘mean emotion’ of a set of faces, even when it was impossible to code every member of the set. Specifically, participants were more likely to identify a test face as a member of the previously displayed set the closer it was to the mean emotion of that set. Despite this implicit knowledge of the mean, participants were unable to correctly identify which of two test faces was a member of the previously presented set, suggesting that they lacked representations of individual items. When explicitly asked to indicate whether a test face was happier or sadder than the mean of the set, observers were remarkably precise. Statistical set representation was quite stable, as it occurred regardless of set size or duration, up to 16 items displayed for as little as 500 ms. When observers viewed sets of inverted faces and fractured faces, patterns of performance paralleled those for upright faces, but overall thresholds were significantly higher. Thus, while statistical representation of inverted and fractured faces is possible, the strategy employed is distinct from and less efficient than that used for upright faces. The results demonstrate that statistical extraction occurs at multiple stages of visual processing.
Faraz Farzin, David Horton