Humans can rapidly analyze the gist of briefly presented natural scenes despite the apparent complexity of the task (Fei-Fei, Iyer, Koch, & Perona,
2007). For instance, participants can rapidly classify natural images according to whether they contain targets from prespecified categories or not (Thorpe, Fize, & Marlot,
1996), even when attention is allocated to another task (Cohen, Alvarez, & Nakayama,
2011; Li, VanRullen, Koch, & Perona,
2002). However, when the scene to be categorized contains four foreground objects, categorization performance suffers under dual-task conditions, suggesting that processing complex scenes that contain multiple objects requires serial attentional processing (Walker, Stafford, & Davis,
2008). Yet participants can report whether an animal is present in one of two scenes as fast as they can report the presence of an animal in a single image (Fei-Fei, VanRullen, Koch, & Perona,
2005; Rousselet, Fabre-Thorpe, & Thorpe,
2002), provided that the two scenes are presented sufficiently far away from each other to minimize interference (VanRullen, Reddy, & Fei-Fei,
2005). Other studies have reported that behavioral performance (RTs and accuracy) suffers if the search for a target object has to be performed simultaneously on multiple scenes (Rousselet, Thorpe, & Fabre-Thorpe,
2004a,
2004b). As noted above, this decrease in performance is partly explained by interstimulus spacing (VanRullen et al.,
2005, but see Fei-Fei et al.,
2005), but it could also arise because processing multiple different scenes is extremely demanding and artificial (Rousselet et al.,
2004b). In real life, humans are typically required to resolve objects embedded in a single scene. VanRullen and Koch (2003) showed that participants correctly reported two to three items from briefly presented scenes, each of which contained 10 different objects. However, this result leaves open the question of whether processing multiple objects from natural scenes incurs costs in processing time relative to a single-stimulus condition (e.g., due to serial shifts of attention).