Visual categorization is a fundamental cognitive process that allows for efficient action in the world. However, it is not yet known whether categorization proceeds automatically from visual input, or is directed by attentional processes. Automatic processes are those well-learned processes that demand little attention. They can often be computed in parallel, and are obligatory in nature, that is, difficult to ignore, alter, or suppress (Shiffrin & Schneider,
1977). Whether visual categorization meets these criteria remains controversial. Both object (Grill-Spector & Kanwisher,
2005) and scene recognition (Fei-Fei, Iyer, Koch, & Perona,
2007; Greene & Oliva,
2009b; Thorpe, Fize, & Marlot,
1996) occur rapidly and seemingly without effort. The impressive performance of human observers in rapid visual categorization has been taken as evidence for the automaticity of categorization (Grill-Spector & Kanwisher,
2005; Thorpe et al.,
1996). However, these results are at odds with theoretical (Fodor,
1983; Pylyshyn,
1999), computational (Riesenhuber & Poggio,
2000), and neurophysiological (Freedman, Riesenhuber, Poggio, & Miller,
2001) models that separate categorization from purely visual processes. Furthermore, the attentional requirements for visual categorizations remain controversial (Cohen, Alvarez, & Nakayama,
2011; Evans & Treisman,
2005; Li, VanRullen, Koch, & Perona,
2002). This controversy may be explained in part by the difficulties of separating categorization processes from the processing of visual features that are diagnostic for a class (Delorme, Rousselet, Mace, & Fabre-Thorpe,
2004; Evans & Treisman,
2005; Johnson & Olshausen,
2003; McCotter, Gosselin, Sowden, & Schyns,
2005). In this work, we use a Stroop-like paradigm to examine the extent to which objects and scenes are categorized when the images themselves are task-irrelevant. We then show that this paradigm can be used to assess the entry-level categories of real-world scenes.