Abstract
We typically think of visual perception as the recovery of increasingly elaborated information about individual objects in a scene. Recent research, however, suggests that other visual processes automatically exploit regularities of scenes in order to construct ‘statistical summary representations’. For example, human observers are able to quickly and effortlessly determine the mean size of a set of heterogeneous circles, even when they cannot reliably encode information about the particular individuals that compose such a set. To investigate the flexibility of these representations, we explored the types of objects over which such processes can operate. Observers viewed scenes consisting of various shapes, and reported whether the *average* shape size was greater on the left or the right half of the display. We first illustrate the striking flexibility of this process by demonstrating that robust statistical summary representations can be formed even over displays containing multiple types of objects: for example, observers can easily compare the mean sizes of a set of circles and a second set of crosses, even when both sets are presented in a single display for only 300 ms. Previous research has assumed that mean sizes are compared on the basis of area, but our results show that more fundamental shape dimensions, such as diameter, play a critical role. We also uncover important limitations of this ability: for example, observers are unable to selectively extract only the mean height or width of a set of ellipses. By showing that the heterogeneity and complexity of the stimuli modulate the ability to selectively extract information, we emphasize the stimulus-driven, automatic nature of statistical extraction. Collectively, these experiments demonstrate how visual processing is streamlined via statistical summary representations, and more precisely how such representations are constructed.
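To make the area-versus-diameter distinction concrete, the sketch below (our own illustration; the specific numbers are hypothetical and not taken from the experiments) shows how the two statistics can dissociate: two sets of circles can share the same mean diameter yet differ in mean area, so displays constructed this way can reveal which quantity observers actually average.

```latex
% Mean diameter vs. mean area for a set of n circles with diameters d_i
% (the area of a circle of diameter d is \pi d^2 / 4):
\[
  \bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i,
  \qquad
  \bar{A} = \frac{1}{n}\sum_{i=1}^{n} \frac{\pi d_i^{2}}{4}
  \;\ge\; \frac{\pi \bar{d}^{\,2}}{4},
\]
% with equality only when all d_i are equal (Jensen's inequality).
% Worked example with hypothetical diameters \{1, 3\} vs. \{2, 2\}:
% both sets have mean diameter 2, but their mean areas differ:
%   \{1, 3\}: (\pi/4)(1 + 9)/2 = 1.25\pi
%   \{2, 2\}: (\pi/4)(4 + 4)/2 = \pi
% An observer averaging by area should judge the first set as larger
% on average; one averaging by diameter should judge them as equal.
```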
Supported by NSF #0132444