Abstract
Vision science commonly assumes that the human visual system is adapted to the statistical properties of natural inputs (Attneave, 1954; Barlow, 1961). The advent of digital photography in the 1990s spurred interest in measuring the statistical properties of photographs and comparing them to neural responses (Simoncelli & Olshausen, 2001). Relatively simple image statistics were found to be correlated with semantic differences between scenes (Torralba & Oliva, 2003). However, because the photographic databases used in this work were limited in size, we know little about the stability of these measurements across natural sources of variation, such as viewpoint, season, or the moment-to-moment image changes associated with ordinary activity. To investigate the susceptibility of scene statistics to such variability, eleven participants recorded first-person videos during multiple walks around a college campus pond over the course of a year. For each of the 97 sessions, we extracted 2-second clips from four geographic regions of interest. For each frame in these clips, we computed three common image statistics: the slope of the power spectrum, RMS contrast, and color variability (defined as A-B entropy in LAB color space). Although the average power spectrum slope (-1.03) was within the range of previously reported values (Ruderman, 1994), we observed considerable variability within individual 2-second clips (60% of overall variability) and between clips over the year (89% of overall variability). By contrast, RMS contrast and color entropy were more stable across viewpoints (26% and 34% of variability occurred within a session, respectively) but varied considerably across seasons. A broader class of low-level and mid-level (e.g., junction frequency) image statistics displayed varying patterns of stability within and across clips.
We argue that the variability of these statistics is an important object of study in its own right and that it also bounds their utility for contributing to invariant scene perception.
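
For readers unfamiliar with the three per-frame statistics named above, the following is a minimal illustrative sketch in Python with NumPy, not the analysis code used in the study. It rests on several assumptions that this abstract does not specify: the slope is fit to the rotationally averaged log power against log spatial frequency (fitting amplitude instead would halve the value), RMS contrast is taken as the standard deviation of intensity divided by its mean, and A-B entropy is taken as the Shannon entropy of a joint 2D histogram of the a and b channels, which are assumed to be already converted to LAB. The function names, bin count, and value range are hypothetical.

```python
import numpy as np


def spectral_slope(gray):
    """Slope of log power vs. log spatial frequency (assumed convention)."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    # 2D power spectrum with the mean (DC component) removed.
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray - gray.mean()))) ** 2
    # Radial frequency of every coefficient, in cycles per image.
    fy = np.fft.fftshift(np.fft.fftfreq(h)) * h
    fx = np.fft.fftshift(np.fft.fftfreq(w)) * w
    radius = np.hypot(fy[:, None], fx[None, :]).round().astype(int)
    # Rotational average: mean power within each integer-radius ring.
    ring_sum = np.bincount(radius.ravel(), weights=power.ravel())
    ring_count = np.bincount(radius.ravel())
    max_r = min(h, w) // 2
    freqs = np.arange(1, max_r)
    ring_mean = ring_sum[1:max_r] / ring_count[1:max_r]
    # Straight-line fit in log-log coordinates; the first coefficient is the slope.
    slope, _intercept = np.polyfit(np.log(freqs), np.log(ring_mean), 1)
    return float(slope)


def rms_contrast(gray):
    """Standard deviation of intensity divided by mean intensity (assumed definition)."""
    gray = np.asarray(gray, dtype=float)
    return float(gray.std() / gray.mean())


def ab_entropy(a, b, bins=32, value_range=(-128.0, 127.0)):
    """Shannon entropy (bits) of the joint a-b histogram (assumed definition).

    `bins` and `value_range` are illustrative choices, not values from the study.
    """
    hist, _, _ = np.histogram2d(
        np.ravel(a), np.ravel(b), bins=bins, range=[value_range, value_range]
    )
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

Under these assumptions, applying the three functions to every frame of a clip yields one time series per statistic, from which within-clip and between-clip variability could then be compared.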