August 2012
Volume 12, Issue 9
Vision Sciences Society Annual Meeting Abstract  |   August 2012
A large-scale taxonomy of real-world scenes
Author Affiliations
  • Michelle Greene
    Stanford University, Department of Computer Science
  • Li Fei-Fei
    Stanford University, Department of Computer Science
Journal of Vision August 2012, Vol.12, 798. doi:10.1167/12.9.798

      Michelle Greene, Li Fei-Fei; A large-scale taxonomy of real-world scenes. Journal of Vision 2012;12(9):798. doi: 10.1167/12.9.798.


      © ARVO (1962-2015); The Authors (2016-present)


Scene classification is critical to human scene understanding. The scientific study of scene perception requires a shared vocabulary and taxonomic organization of scene categories. What classes of scenes are there? The only attempt at building a cognitive scene taxonomy comes from a small study using eight categories (Tversky & Hemenway, 1983). Here, we endeavor to fill this knowledge gap by creating a more comprehensive taxonomy of scene categories, using a large set of image categories that better approximates the richness of the real world. Experiment 1 examined 100 studies in visual cognition and computer vision that listed "scene categorization" or "scene classification" as keywords. We tabulated the categories examined in each study, finding a total of 1195 unique category names. Category occurrence roughly followed a power law: many categories occurred in only one study (n=418), while only a few categories (n=20) were found in at least 10% of studies. The 1195 categories vary in their level of abstraction and represent a highly diverse set of entities, including proper nouns, events, objects, animals, and people. How are these scene categories organized in a conceptual space that reflects human cognition and perception? Experiment 2 examined this question via a large-scale online categorization experiment. We amassed a database of 1055 putative scene categories taken from Experiment 1. Participants viewed pairs of images drawn from either the same or different categories, then indicated whether they would place them in the same category. Results indicate that only a small number of scene categories (~2%) have high participant agreement. Hierarchical clustering reveals multiple levels of class similarity. Altogether, we provide the first large-scale attempt at a full taxonomy of real-world scenes, a critical step for furthering the study of human scene representation and organization.
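
The two analysis steps named in the abstract (tabulating category occurrence across studies, then clustering categories from same/different judgments) can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract does not state the distance measure or linkage rule, so the sketch assumes dissimilarity = 1 minus the proportion of "same" responses, average linkage, and maximal dissimilarity for pairs that were never judged. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tabulate_categories(studies):
    """Count, for each category name, how many studies used it."""
    counts = Counter()
    for categories in studies:
        counts.update(set(categories))  # count a category once per study
    return counts


def summarize(counts, n_studies):
    """Return (unique names, names in one study, names in >= 10% of studies)."""
    singletons = sum(1 for c in counts.values() if c == 1)
    frequent = sum(1 for c in counts.values() if 10 * c >= n_studies)
    return len(counts), singletons, frequent


def dissimilarity(judgments):
    """Map each category pair to 1 - proportion of 'same' responses.

    `judgments` holds (category_a, category_b, said_same) triples.
    """
    tally = {}
    for a, b, said_same in judgments:
        key = frozenset((a, b))
        yes, total = tally.get(key, (0, 0))
        tally[key] = (yes + int(said_same), total + 1)
    return {key: 1 - yes / total for key, (yes, total) in tally.items()}


def average_linkage(names, dist):
    """Agglomerative clustering; returns the merge history as a list of
    (members of cluster 1, members of cluster 2, merge distance)."""
    def between(c1, c2):
        # Assumption: unjudged pairs count as maximally dissimilar (1.0).
        pairs = [dist.get(frozenset((a, b)), 1.0) for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: between(*p))
        merges.append((sorted(c1), sorted(c2), between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    studies = [["beach", "forest", "kitchen"], ["beach", "forest"], ["beach", "street"]]
    print(summarize(tabulate_categories(studies), len(studies)))

    judgments = [
        ("beach", "coast", True),
        ("beach", "coast", True),
        ("beach", "kitchen", False),
        ("coast", "kitchen", False),
    ]
    for merge in average_linkage(["beach", "coast", "kitchen"], dissimilarity(judgments)):
        print(merge)
```

Reading the merge history from first to last gives the nested levels of similarity: categories that participants often placed together merge early at small distances, and broader groupings form later at larger ones.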

Meeting abstract presented at VSS 2012

