However, our visual environment is made up of both discrete objects and extended surfaces that form a spatial layout, and there is substantial evidence that the visual system processes these two types of information separately. For example, fMRI studies in humans have identified brain regions that respond selectively to scenes compared to objects (Epstein,
2005; Epstein & Kanwisher,
1998; Kravitz, Saleem, Baker, & Mishkin,
2011) and that seem to represent features of a scene's spatial layout rather than the objects it contains (Epstein,
2005; Park, Brady, Greene, & Oliva,
2011). In addition, observers can recognize briefly presented scenes even when they cannot recognize any of the objects in those scenes (Oliva & Torralba,
2001; Schyns & Oliva,
1994), providing evidence that scene recognition is independent of object recognition. Greene and Oliva (2009) proposed that this ability could arise from the representation of global properties of scenes, such as the “perspective” or “openness” of a scene. Past research has also distinguished other types of scene information that may be represented, for example, scene meaning (sometimes called “gist”; e.g., whether the scene is a beach, a dining room, etc.; Oliva, 2005; Oliva & Torralba,
2006) and the spatial layout of scenes (Epstein,
2005). Finally, evidence suggests that scene structure, including the spatial layout of a scene, is crucial to guiding our attention during visual search for objects, and may be represented in a global way independent of object processing (e.g., Torralba, Oliva, Castelhano, & Henderson,
2006; Wolfe, Võ, Evans, & Greene,
2011). Despite this evidence that scenes are represented separately from objects, little work has investigated how scene-specific spatial layout information is maintained across saccades or brief delays; most work on scene memory has focused on the role of memory for objects within scenes (Hollingworth, 2004, 2005).