While V1-based models have been applied most successfully to the visibility of grating stimuli, they have also been used to model the detectability of objects or changes in natural scenes (Lovell, Párraga, Ripamonti, Troscianko, & Tolhurst, 2006; Párraga, Troscianko, & Tolhurst, 2005; Peters, Iyer, Itti, & Koch, 2005; Rohaly, Ahumada, & Watson, 1997; Tolhurst, Párraga, Lovell, Ripamonti, & Troscianko,
2005). In particular, V1-based models have been used to devise quality metrics for images or video clips of natural scenes that have been distorted by deliberate compression algorithms, or by noise or bit loss in transmission (Barten, 1990; Chandler & Hemami, 2007; Daly, 1993; Feng & Daly, 2003; Ferwerda, Pattanaik, Shirley, & Greenberg, 1997; Lubin, 1995; Teo & Heeger, 1994; Wang, Bovik, Sheikh, & Simoncelli,
2004). Often, in such studies, the number and variety of images used have been small, or the types of changes presented have been limited to a single dimension, e.g. JPEG or blur degradation. Although our study bears some resemblance to earlier research on metric differences, we want to emphasize that the present work asks whether a model based on V1 properties can predict the magnitude of perceived changes in a variety of plausible suprathreshold manipulations to the contents of natural scenes. We do not aim to provide another image quality metric, but to test the validity of V1 models when they operate on representations of natural changes within images, such as might happen in time-lapse photography of a busy scene. This paper therefore examines how observers perceive differences in a very large number of naturalistic stimuli containing differences that span several feature dimensions, separately or in combination (e.g. blur, hue, saturation, or object shape, size or number). Many of our stimuli exhibit entirely natural changes in scenes, caused by changes in the weather or by the movement of objects, animals and people within the scenes. Unlike the change-blindness literature (Simons & Rensink, 2005), our scene changes are designed not to be challenging for attentional or memory systems.