Abstract
Scene classification is critical to human scene understanding. The scientific study of scene perception requires a shared vocabulary and taxonomic organization of scene categories. What classes of scenes are there? The only attempt at building a cognitive scene taxonomy comes from a small study using eight categories (Tversky & Hemenway, 1983). Here, we endeavor to fill this knowledge gap by creating a more comprehensive taxonomy of scene categories using a large set of image categories that better approximates the richness of the real world. Experiment 1 examined 100 studies in visual cognition and computer vision that listed "scene categorization" or "scene classification" as keywords. We tabulated the categories examined in each study, finding a total of 1195 unique category names. Category occurrence roughly followed a power law: many categories occurred in only one study (n=418), while few categories (n=20) were found in at least 10% of studies. The 1195 categories vary in their level of abstraction and represent a highly diverse set of entities, including proper nouns, events, objects, animals, and people. How are these scene categories organized in a conceptual space that reflects human cognition and perception? Experiment 2 examined this question via a large-scale online categorization experiment. We amassed a database of 1055 putative scene categories taken from Experiment 1. Participants viewed pairs of images drawn from either the same or different categories, then indicated whether they would place them in the same category. Results indicate that only a small number of scene categories have high participant agreement (~2%). Hierarchical clustering reveals multiple levels of class similarity. Altogether, we provide the first large-scale attempt at a full taxonomy of real-world scenes, a critical step for furthering the study of human scene representation and organization.
Meeting abstract presented at VSS 2012