Abstract
We introduce the Proto-object model of visual clutter perception. This unsupervised model segments an image into superpixels, then merges neighboring superpixels that share a common color cluster to obtain proto-objects, defined here as spatially extended regions of coherent features. Clutter is estimated by a simple count of the number of proto-objects. We tested this model using 90 images of realistic scenes selected from the SUN09 image collection to have objects ranging in number from 1 to 60 (as determined by the supplied object ground truth). We had 15 observers individually rank-order these scenes from least to most visually cluttered and took the median of these rankings to obtain a single behavioral ranking. Comparing this behavioral ranking to a second ranking based on the model's clutter estimates, we found that the two correlated highly (Spearman's ρ = .804, p < .001). Follow-up analyses also showed that the Proto-object model was highly robust to changes in its parameters and was generalizable to unseen images, as determined by 10-fold cross-validation. We compared the Proto-object model to six other models of clutter perception and demonstrated that it outperformed each, in some cases dramatically. Importantly, we also showed that the Proto-object model was a better predictor of clutter perception than an actual count of the number of objects in the scenes, suggesting that the "set size" of a scene may be better described by proto-objects than objects. We conclude that the success of the Proto-object model, the new standard for models of clutter perception, is due in part to its use of an intermediate level of visual representation, one between features and objects, and that this is evidence for the potential importance of a proto-object representation in many common visual percepts and tasks.
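The merge-and-count step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the superpixel segmentation and the color clustering have already been computed (the abstract does not specify either algorithm), and it treats two superpixels as neighbors when they touch horizontally or vertically, which is also an assumption. The names `count_proto_objects`, `superpixels`, and `color_cluster` are hypothetical.

```python
from typing import Dict, List


def count_proto_objects(
    superpixels: List[List[int]],
    color_cluster: Dict[int, int],
) -> int:
    """Return the number of regions left after merging adjacent superpixels
    that share a color cluster.

    superpixels:   2-D grid giving the superpixel id of each pixel
                   (assumed to come from some prior segmentation).
    color_cluster: maps each superpixel id to its color-cluster id
                   (assumed to come from some prior clustering).
    """
    # Union-find over superpixel ids; each id starts as its own region.
    parent = {sp: sp for row in superpixels for sp in row}

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: int, b: int) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    rows, cols = len(superpixels), len(superpixels[0])
    for r in range(rows):
        for c in range(cols):
            a = superpixels[r][c]
            # Check the right and lower pixel (4-connectivity, an assumption).
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = superpixels[rr][cc]
                    if a != b and color_cluster[a] == color_cluster[b]:
                        union(a, b)

    # The clutter estimate is the count of distinct merged regions.
    return len({find(sp) for sp in parent})


# Toy example: four superpixels in a 4x4 image.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
clusters = {0: 0, 1: 0, 2: 1, 3: 0}
clutter = count_proto_objects(labels, clusters)
```

In this toy input, superpixels 0, 1, and 3 form a chain of neighbors in the same color cluster and merge into one region, while superpixel 2 stays separate, so the estimate is 2.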
Meeting abstract presented at VSS 2014