Observers can access knowledge about the overall gist and spatial structure of a scene within 100 ms or less (e.g., Biederman,
1981; Potter,
1976; Greene & Oliva,
2009). This knowledge can assist subsequent search. For instance, even a brief scene preview improves eye guidance and shortens response times compared with no preview (e.g., Castelhano & Heaven,
2010; Castelhano & Henderson,
2007; Hillstrom, Scholey, Liversedge, & Benson,
2012; Hollingworth,
2009; Võ & Henderson,
2010) or when the preview is just a jumbled mosaic of scene parts (Castelhano & Henderson,
2007; Võ & Schneider,
2010). However, neither a preview of another scene from the same basic-level category (Castelhano & Henderson,
2007) nor cueing the search scene with the verbal label of its basic-level category (Castelhano & Heaven,
2010) seems to facilitate search. What appears crucial is the guidance supplied by the physical background context of the scene: Previewing the component objects without the background is not beneficial (Võ & Schneider,
2010). This is in line with the finding that search for arbitrary objects is far more efficient when they are embedded in scenes with a consistent background than when they are arranged in arrays on a blank background: Whereas the estimated search slope in a consistent scene is about 15 ms/item, it increases to about 40 ms/item in the absence of any scene context (Wolfe, Alvarez, Rosenholtz, & Kuzmova,
2011). In visual search, knowledge about the spatial structure of scenes enables rapid selection of plausible target locations, biasing search toward a subset of regions in the scene. This has mainly been shown with images of everyday scenes presented on a computer screen (Eckstein, Drescher, & Shimozaki,
2006; Henderson, Weeks, & Hollingworth,
1999; Malcolm & Henderson,
2010; Neider & Zelinsky,
2006; Torralba et al.,
2006; Võ & Wolfe,
2013; Zelinsky & Schmidt,
2009), but there is evidence that placing the target in an expected location also facilitates search in real-world environments. On this point,
Mack and Eckstein (2011) used a search task in which the target object was on a cluttered table in a real room, placed either next to objects that usually co-occur with it or among unrelated objects: Fewer fixations were needed to find it, and search times were shorter, in the former case.