Abstract
Vision researchers have long studied the effect of clutter on visual search performance. Traditionally, researchers have characterized clutter using “target-agnostic” measures that rely only on characteristics of the image itself, irrespective of the observer’s task (e.g., the identity of the search target). Here, we are interested in clutter in natural images and in how the effect of clutter might vary with the target of the search. Along these lines, Semizer and Michel (2022) instructed observers to search natural scenes for objects belonging to a particular category (e.g., cellphones). They then proposed two novel metrics that characterize image clutter in a target-relevant way, based on the similarity between the background scene features and those of the target. The “exemplar-level” metric considers only the features of the specific target exemplar present in a particular search image; it is therefore available only when a target is present in the image. In contrast, the “category-level” metric considers the feature distribution across all exemplars of the target category within the image set, and is available regardless of whether a target is present in the image. Semizer and Michel (2022) defined both metrics using Steerable Pyramid features (Simoncelli & Freeman, 1995). Here, we redefined these metrics using a more flexible feature descriptor, Histograms of Oriented Gradients (Dalal & Triggs, 2005), which is robust to perturbations in scale and position. We then fitted generalized linear models that predict search time from the exemplar-level and category-level metrics, together with a traditional (target-agnostic) clutter metric, to assess how well each accounts for search performance. Our results show that, for images with a target present, the target-agnostic and exemplar-level metrics predict search time better than does the category-level metric. Moreover, when the search target is absent, the category-level metric contributes significantly to explaining search time (i.e., beyond the predictions of the target-agnostic metric alone).
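To make the analysis pipeline concrete, the following is a minimal, hypothetical sketch of how target-relevant clutter metrics built on Histograms of Oriented Gradients might be computed and related to search time with a generalized linear model. The pooled-orientation-histogram representation, the cosine-similarity comparison, the function names, and the Gaussian GLM family are all illustrative assumptions, not the definitions used by Semizer and Michel (2022) or in the present study.

```python
import numpy as np
from skimage.feature import hog
import statsmodels.api as sm


def orientation_histogram(image_gray, orientations=9):
    """Pooled HOG orientation histogram for a grayscale image or patch
    (Dalal & Triggs, 2005), normalized to unit length."""
    blocks = hog(image_gray, orientations=orientations,
                 pixels_per_cell=(8, 8), cells_per_block=(1, 1),
                 feature_vector=False)
    pooled = blocks.sum(axis=(0, 1, 2, 3))          # -> (orientations,)
    return pooled / (np.linalg.norm(pooled) + 1e-12)


def similarity(scene_gray, patch_gray):
    """Cosine similarity between pooled orientation histograms
    (an assumed stand-in for the paper's feature-similarity computation)."""
    return float(orientation_histogram(scene_gray)
                 @ orientation_histogram(patch_gray))


def exemplar_level_clutter(scene_gray, target_exemplar_gray):
    """Hypothetical exemplar-level metric: similarity between the search
    image and the specific target exemplar it contains."""
    return similarity(scene_gray, target_exemplar_gray)


def category_level_clutter(scene_gray, category_exemplars_gray):
    """Hypothetical category-level metric: mean similarity between the search
    image and all exemplars of the target category in the image set."""
    return float(np.mean([similarity(scene_gray, ex)
                          for ex in category_exemplars_gray]))


def fit_search_time_glm(search_times, exemplar, category, target_agnostic):
    """GLM predicting search time from the three clutter metrics.
    (The Gaussian family with identity link is an assumption, not the
    paper's modeling choice.)"""
    X = sm.add_constant(np.column_stack([exemplar, category, target_agnostic]))
    return sm.GLM(np.asarray(search_times), X,
                  family=sm.families.Gaussian()).fit()
```

In this sketch, each clutter metric enters the model as a separate predictor, so the fitted coefficients indicate how much each metric contributes to explaining search time beyond the others.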