Abstract
Visual memory is subject to systematic errors. By understanding how these errors come about, we can uncover fundamental processes that shape the visual representations of human memory. One of the most robust and perplexing types of memory error is boundary transformation, in which observers reliably misremember a scene as either farther (boundary extension) or closer (boundary contraction) than it actually was. What drives these boundary-transformation errors? The normalization theory proposes that scene memories are biased toward canonical views. For example, if our view of a scene is unusually close, our memory will be biased toward a farther and more typical view, showing boundary extension. This theory raises a central question that has yet to be addressed: Can boundary-transformation effects be predicted from the natural statistics of observed viewpoints in real-world scenes? Here we leveraged a large sample of scenes with ground-truth depth maps to quantify the natural statistics of viewing distance. We characterized the distributions of mean depth values in natural images for multiple categories of indoor scenes (e.g., kitchen, office) and used these as estimates of viewing distance. We then performed a series of behavioral experiments to determine whether remembered scene boundaries transform toward the most frequent depth for different categories. In experiments involving both real-world scenes and well-controlled virtual environments, we found that the natural statistics of scene depth are predictive of the direction and magnitude of boundary-transformation effects. Furthermore, we found that an analogous phenomenon exists for images of individual objects—remembered objects are transformed toward their average sizes in real-world images. Together, these findings demonstrate that the natural statistics of images are predictive of memory errors for image boundaries, and they suggest a normalization process by which memories of scenes and objects are biased toward canonical views.
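To make the depth-statistics step concrete, the following is a minimal illustrative sketch, not the authors' actual analysis pipeline. It assumes each image comes with a ground-truth depth map (a 2-D array of distances in meters) and a scene-category label (e.g., "kitchen", "office"); the function names, data formats, and histogram-based mode estimate are assumptions introduced here for illustration.

```python
import numpy as np
from collections import defaultdict


def mean_scene_depth(depth_map: np.ndarray) -> float:
    """Mean of all valid (finite, positive) values in a ground-truth depth map."""
    valid = depth_map[np.isfinite(depth_map) & (depth_map > 0)]
    return float(valid.mean())


def depth_statistics_by_category(depth_maps, categories, n_bins=50):
    """Collect the distribution of per-image mean depths for each scene category
    and estimate its most frequent (modal) value from a histogram.

    depth_maps -- iterable of 2-D arrays (meters), one per image
    categories -- iterable of category labels aligned with depth_maps
    """
    per_category = defaultdict(list)
    for depth, category in zip(depth_maps, categories):
        per_category[category].append(mean_scene_depth(depth))

    stats = {}
    for category, values in per_category.items():
        values = np.asarray(values)
        counts, edges = np.histogram(values, bins=n_bins)
        modal_bin = np.argmax(counts)
        stats[category] = {
            "mean_depths": values,  # full distribution of viewing-distance estimates
            "modal_depth": 0.5 * (edges[modal_bin] + edges[modal_bin + 1]),
        }
    return stats
```

In the abstract's framing, the modal (most frequent) depth for a category would serve as the typical viewing distance toward which remembered boundaries are predicted to transform: views closer than it should show boundary extension, and views farther than it should show boundary contraction.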