Abstract
Purpose: Both local and configural processes play a role in the figure-ground organization of scenes. Local factors include bottom-up edge segmentation, which enables small regions to be fused into figural regions. The Berkeley Segmentation Dataset (Martin, Fowlkes, Tal, and Malik, 2001) provides a corpus of images whose contours were segmented by hand, making it well suited to studying local processes. Configural factors include top-down processes such as grouping and meaningfulness. Barghout (2009) created a complementary dataset designed to capture the configural information required for studying top-down processes. In this study we collected additional data for the images in this corpus to enable rigorous statistical analysis. The data-collection paradigm assumes that asking someone to mark the “center of the subject of the photograph” serves as a proxy for the figural status of the region centered at the marked point. Because the method does not distinguish among a foreground, a single object, and an object within an object, we coined the term “spatial taxon” to refer to the object or object group centered at the indicated position. This operational definition is analogous to, but much broader than, the term “figure” as defined in the literature.
Methods: Participants were asked to “mark the center of the subject of the photograph” and to label it. K-means clustering was used to determine spatial taxons, as operationally defined above. Rank-frequency distributions for spatial taxons and for the corresponding word labels were fit via linear regression.
Results: The results suggest a natural-scene-perception architecture composed of nested hierarchies whose rank-frequency distributions, along with those of the corresponding word labels, are described by inverse power laws. Unlike the results for natural scenes, the spatial taxon rank-frequency distributions for two of the three ambiguous figures were uniform, and the word labels corresponded to their percepts, which supports the assumptions underlying the methodology.
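
As an illustration of the rank-frequency analysis summarized above, the following minimal sketch fits an inverse power law, f(r) = c · r^(-alpha), to a set of counts. It assumes that the linear regression is carried out on log-transformed rank and frequency, which is a common way to fit a power law but is not stated explicitly here. The function names and the example labels are hypothetical, and the cluster labels are taken as given (for instance, as the output of a K-means step) rather than computed.

```python
import math
from collections import Counter


def rank_frequency(labels):
    """Count each label and return the counts sorted so that rank 1 is the most frequent."""
    return sorted(Counter(labels).values(), reverse=True)


def fit_power_law(frequencies):
    """Fit log(f) = log(c) - alpha * log(rank) by ordinary least squares.

    Assumption: the regression is done in log-log space.
    Returns (alpha, c, r_squared).
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two ranks to fit a line")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    # A perfectly flat (uniform) distribution has no variance to explain.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return -slope, math.exp(intercept), r_squared


if __name__ == "__main__":
    # Hypothetical cluster assignments: one label per participant response.
    labels = (["taxon_A"] * 60 + ["taxon_B"] * 30 + ["taxon_C"] * 20
              + ["taxon_D"] * 15 + ["taxon_E"] * 12)
    freqs = rank_frequency(labels)
    alpha, c, r2 = fit_power_law(freqs)
    print(f"alpha={alpha:.3f}, c={c:.1f}, R^2={r2:.3f}")
```

Under this sketch, a uniform rank-frequency distribution, as reported for two of the ambiguous figures, would yield an exponent near zero, whereas an inverse power law yields a clearly positive exponent with a good linear fit in log-log space.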