Abstract
Humans can categorize complex natural scenes quickly and accurately. Which properties of scenes enable such an astonishing feat? We have recently found the neural representation of scene categories in the PPA to be compatible across line drawings and photographs of scenes (Walther et al., PNAS 2011). This finding allows us to use line drawings as stand-ins for photographs when investigating scene categorization. We performed a six-alternative forced-choice (6AFC) scene categorization experiment and verified that participants could categorize scenes based on line drawings as well as photographs with presentation times as short as 27 ms. To explore the critical scene properties, we extracted five sets of properties from the line drawings: contour length, orientation, and curvature, as well as the type and angle of contour junctions. We then categorized natural scenes computationally based on the statistical distributions of these properties. Orientation allowed for the highest categorization accuracy. However, we found that the patterns of categorization errors for curvature, junction type, and junction angle provided the best match with the errors made by humans in the 6AFC experiment. Thus, properties of junctions appear to be particularly relevant for the human ability to categorize scenes. We verified this computational prediction in an additional behavioral experiment with manipulated line drawings of scenes, in which the junctions were perturbed while contour length, orientation, and curvature were preserved. As expected, this manipulation led to a significant decrease in categorization accuracy. Our results indicate that the human ability to categorize complex natural scenes is to a large extent driven by the structure of scenes, which is described by contour junctions. Line orientation, which is tightly linked to the spatial frequency spectrum, is useful for computational scene categorization but does not match human behavior. This finding challenges the popular view that natural scene categorization relies on statistical regularities of the spatial frequency spectrum.
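The comparison between computational and human error patterns can be illustrated with a minimal sketch. The abstract does not state how the match was quantified, so the measure used here (Pearson correlation of the off-diagonal entries of row-normalized confusion matrices) is an assumption, and the function names `off_diagonal_errors` and `error_pattern_correlation` are hypothetical.

```python
import numpy as np


def off_diagonal_errors(confusion):
    """Return the error rates (off-diagonal cells) of a confusion matrix.

    Rows are true categories, columns are responses. Each row is
    normalized to sum to 1 so that the entries are response rates.
    Assumes every row contains at least one response.
    """
    confusion = np.asarray(confusion, dtype=float)
    rates = confusion / confusion.sum(axis=1, keepdims=True)
    # Keep only the cells where the response differs from the true category.
    mask = ~np.eye(confusion.shape[0], dtype=bool)
    return rates[mask]


def error_pattern_correlation(confusion_a, confusion_b):
    """Pearson correlation between two error patterns (assumed measure)."""
    errors_a = off_diagonal_errors(confusion_a)
    errors_b = off_diagonal_errors(confusion_b)
    return float(np.corrcoef(errors_a, errors_b)[0, 1])


if __name__ == "__main__":
    # Made-up 3-category counts for illustration only, not data from the study.
    human = [[8, 1, 1], [2, 7, 1], [0, 3, 7]]
    model = [[7, 2, 1], [3, 6, 1], [1, 3, 6]]
    print(error_pattern_correlation(human, model))
```

Under this assumed measure, a property set whose classifier confuses the same category pairs as human observers yields a high correlation regardless of its overall accuracy, which is the sense in which a feature can match human behavior without being the most accurate one.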
Meeting abstract presented at VSS 2013