Blue jays looking for moths, pigeons looking for seeds, and humans looking for colored rectangles all show remarkable similarities in their visual search behavior. One similarity is that search is fastest when the search target does not vary over time (Blough, 2002; Bond & Kamil, 2002; Kristjánsson, Wang, & Nakayama, 2002). Because the target is predictable, observers can selectively attend to a subset of the stimulus. This selection process is thought to involve the creation of a search template that specifies the target's appearance (Bundesen, 1990; Duncan & Humphreys, 1989; Hamker, 2005; Tinbergen, 1960; Vickery, King, & Jiang, 2005). This template is then used to bias the processing of sensory information in favor of stimuli that resemble the target (Desimone & Duncan, 1995; Usher & Niebur, 1996). While there is empirical support for the existence of a search template (Chelazzi, Duncan, Miller, & Desimone, 1998; Chelazzi, Miller, Duncan, & Desimone, 1993), the characteristics of this template are unknown.
One obstacle to characterizing the search template has been that many conditions that elicit selective attention also elicit perceptual priming (Kristjánsson et al., 2002; Maljkovic & Nakayama, 1994). Perceptual priming is the enhanced processing of a stimulus that has been encountered repeatedly, especially if the stimulus was behaviorally relevant (Wiggs & Martin, 1998). A key difference between selective attention and perceptual priming is that selective attention depends on the observer's expectations, while perceptual priming operates independently of these expectations. Although these two processes are clearly distinct, they are typically confounded in search experiments. In most experiments, observers either search for the same target across trials or they preview the target before the onset of the search array (e.g., Vickery et al., 2005; Zelinsky, Rao, Hayhoe, & Ballard, 1997). Because these methods use targets that are both predictable and repeated, they may invoke both selective attention and perceptual priming. In this study, we isolated the effect of selective attention by using targets that repeated infrequently and by prompting observers with name cues rather than image cues.
Isolating the effect of selective attention allowed us to examine the nature of the search template. Previous researchers have usually characterized the search template as either an exact image or as a set of basic features (Bourke & Duncan, 2005; Najemnik & Geisler, 2005; Rajashekar, Cormack, & Bovik, 2004; Usher & Niebur, 1996; cf. Rao, Zelinsky, Hayhoe, & Ballard, 2002). Such templates are not unreasonable for very simple stimuli, but they may be poorly suited for real objects. Basic features like ‘vertical’ or ‘curved’ are too general to be effective search templates because these features are ubiquitous in natural images. In contrast, exact images are too specific to be effective search templates because they cannot tolerate the variations that arise when an object is seen from different viewpoints. The search templates that are used in everyday vision likely lie between the two extremes, showing both specificity for particular objects or categories of objects and tolerance to viewpoint variation.
The goal of this study was to examine the specificity of the search templates used in everyday search. Although exact images are unlikely candidates because of their extreme specificity, these templates are well defined and so easily tested, and we used them as our starting point. We trained observers to associate target names with specific images so that when they were later cued with a name in a search experiment, they could form, at least in theory, an exact image template. To test whether observers actually use exact templates, we measured search performance for targets that varied in their similarity to the studied image.
The stimuli we chose for the experiment were photographic composites of coral reef scenes and the search targets were tropical fish (Figure 1). Each scene contained at most one fish, and the observer's task was always simply to decide whether a fish (any fish) was present. The targets were drawn from ten fish species with highly distinctive colors and markings. We chose these stimuli for two reasons. First, because coral reef scenes are relatively unstructured, they place no constraints on the location of the fish targets. And second, because the distinctive patterns on the fish act as disruptive camouflage (Stevens, Cuthill, Windsor, & Walker, 2006), they conceal the fish's shapes. Assuming that shape is the key attribute for categorizing an object as a fish, these patterns make it difficult to search for fish in general. At the same time, the distinctiveness of the patterns makes it easy to search for a specific fish. Maximizing the difference between search for the basic-level category and search for an individual was important because our effect size was constrained by the difference between searching with and without cues to the target fish's identity.
Before participating in the experiment, the observers learned to associate the names of five species with an image from each species. During the experiment, the observers were cued with one of the five studied names before the onset of the reef scene. On some trials the target in the scene was the image that observers had learned to associate with the species name, but on other trials the target was only similar to this image. To examine the specificity of the observer's search template we used three levels of target variation:
- no variation: the target was identical to the studied image,
- 2D viewpoint variation: the target was a rotated, flipped and scaled version of the studied image, and
- subordinate level variation: the target was from the same species as the studied image.
As controls, we also included a nonspecific cue condition (the cue was simply the word “fish”) as well as target fish from species that had not been studied.
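To make the 2D viewpoint variation concrete, the following is a minimal sketch of how a rotated, flipped, and scaled variant of an image could be produced. It is an illustration only: the function names, the restriction to 90-degree rotations, and the integer scale factor are assumptions made for this example and are not the transformation parameters used in the study.

```python
# Illustrative sketch only: images are plain row-major grids (lists of rows).
# The 90-degree rotation step and the integer scale factor are assumptions.

def rotate90(img):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror a grid left to right."""
    return [row[::-1] for row in img]

def scale_up(img, factor):
    """Enlarge a grid by repeating each pixel `factor` times in both directions."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def viewpoint_variant(img, quarter_turns=1, flip=True, factor=2):
    """Apply rotation, optional mirroring, and scaling, in that order."""
    out = img
    for _ in range(quarter_turns % 4):
        out = rotate90(out)
    if flip:
        out = flip_horizontal(out)
    return scale_up(out, factor)

studied = [[1, 2],
           [3, 4]]
variant = viewpoint_variant(studied)
```

The variant contains the same pixel values as the studied image but no longer matches it point for point, which is the property that separates this condition from the no-variation condition.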
On the surface, the interpretation of this experiment seems fairly straightforward: If the name cues prompt observers to use an exact image template to search for the target, then we expect a cue advantage only for the targets that are identical to the studied image. If the name cues prompt observers to use a less specific template that is tolerant to viewpoint and exemplar variation, then we expect the name cue to benefit all three conditions relative to the nonspecific cue condition. This interpretation is complicated, however, by the uncontrolled nature of the stimuli. We elected to use natural objects and natural categories because we wanted to study the type of problem that the visual system is designed to solve. But because it is unclear how to measure the visual similarity of natural objects, we could not quantify the differences between our conditions. If, for example, the results showed no difference between the same-image and same-species conditions, this might reveal something about the nature of the search template, or it might simply indicate that there is little variation within species. The interpretation of these results would be clearer if we had another measure of the perceptual similarity of our stimuli.
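The two predictions described above can be written out as a small lookup. This is a sketch under simplifying assumptions: the condition and template labels are hypothetical names chosen for this example, and each template is treated as either matching a target or not, with no graded similarity.

```python
# Hypothetical labels for the three levels of target variation.
TARGET_CONDITIONS = ("same_image", "viewpoint_2d", "same_species")

# Assumption: a template either matches a target condition or it does not.
TEMPLATE_MATCHES = {
    # An exact image template matches only the studied image itself.
    "exact_image": {"same_image"},
    # A tolerant template also matches viewpoint and exemplar variants.
    "tolerant": {"same_image", "viewpoint_2d", "same_species"},
}

def predicted_cue_advantage(template):
    """Return, per condition, whether a name cue is predicted to improve
    search relative to the nonspecific "fish" cue."""
    matches = TEMPLATE_MATCHES[template]
    return {cond: cond in matches for cond in TARGET_CONDITIONS}

exact = predicted_cue_advantage("exact_image")
tolerant = predicted_cue_advantage("tolerant")
```

Under these assumptions the two hypotheses diverge only in the viewpoint and same-species conditions, so those are the conditions that discriminate between them.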
To obtain a second measure of the perceptual similarity of our stimuli, we conducted a preliminary experiment with image cues. The image cue was either the target image, a transformed version of the target image, or an image of a fish from the same species as the target image. Previous research has shown that the best image cue is an exact match of the target: If the cue and target differ in size or orientation, or if they are different exemplars of the same type, then the cue is less effective (Vickery et al., 2005; Wolfe, Horowitz, Kenner, Hyle, & Vasan, 2004). Based on these results, we expected that the image cue experiment would provide a sensitive measure of the similarity of our stimuli.