Abstract
Visual search performance is typically measured as a function of set size, but it is unclear how to determine set size in cluttered scenes. Previously, we claimed that in such scenes, set size corresponds to the number of segmentable regions rather than the number of objects (Bravo & Farid, 2003). We supported this claim by showing that distractors with multiple regions produce steeper search functions than distractors with a single region. Our goal this year was to quantify this relationship by using a computational model to count the regions in our clutter stimuli.
There are many computational models for segmenting an image into regions; of these, graph-based approaches have shown particular promise. We employed one such algorithm (Felzenszwalb & Huttenlocher, 2004) to count the number of regions in our 2003 stimuli. We then used this measure of set size to replot the search time data. Using the number of segmentable regions as the measure of set size produced a better fit (R^2 = 0.995) than did using the number of distractor objects (R^2 = 0.916).
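As a minimal sketch of this procedure (not the authors' original code), the fragment below counts segmentable regions with the Felzenszwalb-Huttenlocher graph-based algorithm and regresses search times on those counts. The use of scikit-image, the parameter values (scale, sigma, min_size), and the file names and search times are illustrative assumptions, not values from the study.

```python
import numpy as np
from scipy.stats import linregress
from skimage import io
from skimage.segmentation import felzenszwalb

def count_regions(image_path, scale=100, sigma=0.8, min_size=20):
    """Segment an image and return the number of regions it contains."""
    image = io.imread(image_path)
    labels = felzenszwalb(image, scale=scale, sigma=sigma, min_size=min_size)
    return len(np.unique(labels))

# Hypothetical stimulus files and mean search times (ms), for illustration only.
stimuli = ["stim_01.png", "stim_02.png", "stim_03.png", "stim_04.png"]
search_times = [620.0, 780.0, 955.0, 1180.0]

region_counts = [count_regions(path) for path in stimuli]

# Linear fit of search time against region count; rvalue**2 is the R^2
# statistic of the kind reported in the text.
fit = linregress(region_counts, search_times)
print(f"slope = {fit.slope:.1f} ms/region, R^2 = {fit.rvalue**2:.3f}")
```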
Like that of all computational models of image segmentation, the output of this algorithm is highly scale-dependent. By adjusting the algorithm's parameters, a single image can be segmented into 50 regions or 500 regions. This variability is not a limitation of the algorithm; it reflects the scale ambiguity inherent in image segmentation. To determine whether our choice of parameters was fortuitous, we explored the parameter space and found that nearly every set of values produced an excellent fit to our data. Evidently, the region counts of our stimuli remain roughly proportional to one another over a wide range of scales. We have confirmed that this proportionality also holds across many natural images.
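A hedged sketch of this parameter exploration, reusing count_regions, stimuli, and search_times from the previous fragment: re-segment each stimulus at several values of the scale parameter and check that the fit to search times (and hence the proportionality of region counts across images) survives. The specific scale values swept here are assumptions.

```python
# Sweep the segmentation scale and refit search time vs. region count.
scales = [25, 50, 100, 200, 400, 800]

for scale in scales:
    counts = [count_regions(path, scale=scale) for path in stimuli]
    fit = linregress(counts, search_times)
    # If counts at different scales stay roughly proportional across images,
    # the quality of the linear fit (R^2) should remain high at every scale.
    print(f"scale={scale:4d}: regions={counts}, R^2={fit.rvalue**2:.3f}")
```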
We conclude that computational models of image segmentation can provide a good measure of relative set size in cluttered stimuli.