Abstract
Figure/ground (f/g) organization is a step of perceptual organization that assigns each contour to one of its two abutting regions. Peterson et al. showed that familiar configurations of contours, such as outlines of recognizable objects, provide a powerful cue that can dominate traditional f/g cues such as symmetry. In this work we: (1) provide an operationalization of “familiar configuration” in terms of prototypical local shapes, without requiring global object recognition; (2) show that a classifier based on this cue works well on images of natural scenes.
A dataset of 200 natural images was hand-segmented into disjoint regions by human subjects. Subjects then provided an f/g label for each contour associated with a pair of abutting segments [Fowlkes, Martin & Malik ECVP03]. Our goal is to correctly predict these f/g labels from image measurements.
We use “shape context” to represent the local shape configuration at each point. Shape context [Belongie, Malik & Puzicha ICCV01] is a shape descriptor that summarizes the local arrangement of edges, relative to the center point, in a log-polar fashion. In order to work with grayscale images, we use a variant of shape context, geometric blur [Berg & Malik CVPR01], aligned to the local tangent direction. We cluster a large set of these descriptors to construct a small list of prototypical shape configurations, or “shapemes” (analogous to phonemes). Shapemes capture important local structures such as convexity and parallelism.
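As a rough illustration of the descriptor and clustering steps, the following Python sketch builds a tangent-aligned log-polar histogram over edge points and clusters such histograms with plain k-means. It is not the implementation used in this work: it operates on binary edge points rather than geometric blur on grayscale images, and the function names (shape_context, cluster_shapemes), the bin counts, the radii, and the choice of k-means are all assumptions made for illustration.

```python
import math
import random


def shape_context(center, edge_points, tangent=0.0,
                  n_r=3, n_theta=8, r_min=1.0, r_max=50.0):
    """Normalized log-polar histogram of edge points around `center`.

    Angles are measured relative to `tangent` (radians), so the descriptor
    is aligned to the local contour direction. Bin counts and radii are
    illustrative defaults, not values from the abstract.
    """
    hist = [0.0] * (n_r * n_theta)
    log_min, log_span = math.log(r_min), math.log(r_max / r_min)
    cx, cy = center
    for x, y in edge_points:
        dx, dy = x - cx, y - cy
        r = math.hypot(dx, dy)
        if r < r_min or r >= r_max:
            continue  # skip the center itself and points outside the support
        r_bin = min(int((math.log(r) - log_min) / log_span * n_r), n_r - 1)
        theta = (math.atan2(dy, dx) - tangent) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin * n_theta + t_bin] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def _sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def cluster_shapemes(descriptors, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm); the k centers play the role of shapemes."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda j: _sq_dist(d, centers[j]))
            clusters[nearest].append(d)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers
```

In this sketch, one descriptor would be computed per contour point, with `tangent` set to the contour direction at that point, and the pooled descriptors from all training images would be passed to cluster_shapemes.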
For each point along a contour, we measure the similarity of its local shape descriptor to each shapeme. These measurements are combined using a logistic regression classifier to predict the f/g label. By averaging the classifier outputs over all points on each contour, we obtain an error rate of 30% (chance is 50%). This compares favorably to the traditional f/g cues used in [Fowlkes et al. 03]. Enforcing consistency constraints at junctions further reduces the error rate to 22%, making this approach a promising model of figure/ground organization.
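The classification step can be sketched in the same spirit. The Python below turns a descriptor into a vector of similarities to each shapeme, fits a logistic regression by batch gradient descent, and averages the per-point outputs along a contour. The Gaussian similarity, the learning rate, the label encoding (1 if a designated side of the contour is figure, 0 otherwise), and all function names are assumptions for illustration; the junction consistency constraints are not modeled here.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def shapeme_similarities(descriptor, shapemes, scale=1.0):
    """Similarity of one descriptor to each shapeme.

    Uses exp(-squared distance / scale), an assumed similarity measure.
    """
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(descriptor, s)) / scale)
        for s in shapemes
    ]


def _predict(x, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(features, labels, lr=0.5, n_iter=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = _predict(x, w, b) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def contour_figure_probability(point_features, w, b):
    """Average the per-point classifier outputs over one contour."""
    probs = [_predict(x, w, b) for x in point_features]
    return sum(probs) / len(probs)
```

Under these assumptions, a contour would be assigned the label 1 when its averaged probability exceeds 0.5 and the label 0 otherwise.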