Abstract
Past literature has suggested that summary statistics for groups (i.e., ensembles) of faces or objects can be rapidly extracted, with visual ensemble perception typically becoming more efficient as set size increases. Tharmaratnam and colleagues (VSS 2019) recently demonstrated that the average scene content (i.e., perceived naturalness or manufacturedness) and spatial boundary (i.e., perceived openness or closedness) of scene ensembles can be extracted without reliance on visual working memory (VWM) resources. However, unlike in past literature, the difficulty of extracting average scene ensemble features increased with set size. To investigate this further, in the present study we varied scene ensemble task difficulty by manipulating scene feature complexity. In Experiment 1, participants were asked to report the average orientation of ensembles of rotated scenes (simpler feature). In Experiment 2, participants were asked to report the average sound level of ensembles of scenes that varied in perceived sound quality (i.e., noisy or quiet; complex feature). In both experiments, we varied set size by randomly presenting 1, 2, 4, or 6 scenes to participants on each trial, and additionally measured VWM capacity using a two-alternative forced-choice task. We found that participants were able to accurately extract summary statistics for both ensemble scene features. Importantly, all 6 items were integrated into their percepts without relying on VWM, as fewer than 1.3 scenes were remembered on average. Interestingly, as set size increased, task performance did not change when rating average scene orientation, but decreased when rating average sound level. This latter finding is consistent with our previous findings for average scene content and spatial boundary.
Taken together, these results broaden our understanding of the cognitive mechanisms governing ensemble perception by demonstrating that the number of items and the feature complexity of the incoming sensory information both contribute to the formation of ensemble representations.