Abstract
Visual working memory is a reconstructive process that requires integrating multiple hierarchical representations of objects. This hierarchical reconstruction allows us to overcome perceptual uncertainty and limited cognitive capacity, but it yields systematic biases in working memory, as individual items are influenced by the ensemble statistics of the scene or of their particular group. Given the importance of the hierarchical encoding of a display for visual memory, we aim to characterize the structured priors people use to encode visual scenes. To discover these priors, we use an iterated learning task in which participants recall the locations of 15 dots, and the report of one person becomes the stimulus for the next person in the “chain”. Over many iterations of such a chain, reported locations increasingly reflect the prior biases that people bring to bear on the encoded visual display. Previously, we showed that such iterated learning chains reveal the patterns of spatial grouping people expect when encoding the positions of homogeneous dots; these priors appear to correspond to the Gestalt rules of proximity, continuity and similarity. The current study examines how surface features (namely colors) influence the encoding of visual displays and spatial grouping in visual working memory. We found that distinct colors dominate spatial factors (such as proximity and continuity) in grouping: reported positions tend to converge toward tight, segregated color clusters that are increasingly linear and regularly spaced. We built several ideal observer clustering models to identify which assumptions are critical to emulate human behavior, and found that human performance is consistent with spatial clustering under strong assumptions of within-group color homogeneity.
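The logic of an iterated learning chain can be illustrated with a toy simulation. The sketch below is not the ideal observer model used in the study. It assumes, purely for illustration, that each simulated participant encodes the 15 dot locations with Gaussian noise and then pulls every dot toward the centroid of its own color group. The function names, the split into three colors of five dots each, and the `noise_sd` and `shrink` values are all hypothetical.

```python
import numpy as np

def recall(positions, colors, rng, noise_sd=0.05, shrink=0.3):
    """One simulated participant (hypothetical model): noisy encoding, then
    reconstruction that pulls each dot toward its color-group centroid."""
    noisy = positions + rng.normal(0.0, noise_sd, size=positions.shape)
    recalled = noisy.copy()
    for c in np.unique(colors):
        mask = colors == c
        centroid = noisy[mask].mean(axis=0)
        recalled[mask] = (1 - shrink) * noisy[mask] + shrink * centroid
    return recalled

def within_color_spread(positions, colors):
    """Mean distance of each dot from the centroid of its color group."""
    dists = []
    for c in np.unique(colors):
        pts = positions[colors == c]
        dists.append(np.linalg.norm(pts - pts.mean(axis=0), axis=1))
    return float(np.concatenate(dists).mean())

def run_chain(n_iterations=20, seed=0):
    """Pass one display down a chain: each report is the next stimulus."""
    rng = np.random.default_rng(seed)
    colors = np.repeat(np.arange(3), 5)              # assumed: 3 colors x 5 dots
    positions = rng.uniform(0.0, 1.0, size=(15, 2))  # random seed display
    spreads = [within_color_spread(positions, colors)]
    for _ in range(n_iterations):
        positions = recall(positions, colors, rng)
        spreads.append(within_color_spread(positions, colors))
    return spreads

if __name__ == "__main__":
    spreads = run_chain()
    print(f"spread at start: {spreads[0]:.3f}, after chain: {spreads[-1]:.3f}")
```

Under these assumptions the within-color spread shrinks along the chain, because the built-in bias toward color-group centroids accumulates across participants while the encoding noise does not. This is the sense in which the end of a chain reflects the prior rather than the seed display.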