August 2007
Volume 7, Issue 11
Research Article  |   August 2007
Task-set switching with natural scenes: Measuring the cost of deploying top-down attention
Journal of Vision August 2007, Vol.7, 9. doi:https://doi.org/10.1167/7.11.9
      Dirk B. Walther, Li Fei-Fei; Task-set switching with natural scenes: Measuring the cost of deploying top-down attention. Journal of Vision 2007;7(11):9. https://doi.org/10.1167/7.11.9.

Abstract

In many everyday situations, we bias our perception from the top down, based on a task or an agenda. Frequently, this entails shifting attention to a specific attribute of a particular object or scene. To explore the cost of shifting top-down attention to a different stimulus attribute, we adopt the task-set switching paradigm, in which switch trials are contrasted with repeat trials in mixed-task blocks and with single-task blocks. Using two tasks that relate to the content of a natural scene in a gray-level photograph and two tasks that relate to the color of the frame around the image, we were able to distinguish switch costs with and without shifts of attention. We found a significant cost in reaction time of 23–31 ms for switches that require shifting attention to other stimulus attributes, but no significant switch cost for switching the task set within an attribute. We conclude that deploying top-down attention to a different attribute incurs a significant cost in reaction time, but that biasing to a different feature value within the same stimulus attribute is effortless.

Introduction
Imagine walking into a crowded restaurant, looking for your friend whom you are supposed to meet. You will be looking around, scanning the faces of the patrons for your friend's without paying much attention to the interior design or the furniture. Entering the same restaurant with the intention of finding a suitable table, on the other hand, will have you looking at pretty much the same scene, and yet your perception will be biased for the arrangement of the furniture, mostly ignoring the other guests. 
A task or an agenda affects our visual perception through a set of processes that we commonly term top-down attention to distinguish them from stimulus-driven bottom-up attention (Itti & Koch, 2001; Treisman & Gelade, 1980). Top-down attention enables us to preferentially perceive what is important for the task at hand. Without top-down attention to the relevant parts of a scene, we may even miss large changes (Rensink, O'Regan, & Clark, 1997; Simons & Levin, 1998; Simons & Rensink, 2005) until we are explicitly cued. If cueing changes our visual perception so effectively, what is the cost of deploying attention to a new stimulus attribute?
Humans can detect object categories in natural scenes in as little as 150 ms (Potter & Levy, 1969; Thorpe, Fize, & Marlot, 1996). Li, VanRullen, Koch, and Perona (2002) demonstrated that this can be achieved even when spatial attention is tied to a demanding task elsewhere in the visual field. At this fast processing speed, there is not enough time for feedback within the visual hierarchy, suggesting purely feed-forward, bottom-up processing. Because of their block design, these experiments allow subjects to prepare for the given task well in advance, giving them ample time to bias their visual system accordingly. 
We are interested in the reaction time (RT) cost for adjusting the visual system for a new task or a task set from trial to trial. By comparing switches that require shifting attention across different attributes (from color to image content) with those that do not require a shift of attention, we ask how efficient it is to bias the visual system from the top down to allow for subsequent fast feed-forward processing of stimuli. 
Wolfe, Horowitz, Kenner, Hyle, and Vasan (2004) approached a similar question for visual search by cueing odd-one-out search tasks in sets of 6–18 items, finding an RT cost of up to 200 ms for picture cues and 700 ms for word cues for the mixed versus blocked condition. These endogenous cues take about 200 ms to become fully effective. However, with their design, Wolfe et al. were not able to separate the cost for deploying top-down attention from the cost for other processes such as perceiving and interpreting the cue. 
We address this question by adapting the task switching paradigm, recently reviewed by Monsell (2003), to fast natural scene categorization tasks. Task switching was introduced by Jersild (1927), who had students work through lists of simple computation tasks (adding and subtracting 3 from numbers). He found that blocks in which the two tasks alternate require considerably more time than blocks with single tasks. This result was later verified by Spector and Biederman (1976). 
In general, task switching experiments require subjects to perform two or more tasks that typically relate to different attributes of the stimulus; for example, reporting whether a number is odd or even versus whether it is smaller or greater than 5. Subjects are tested in blocks with single tasks and in mixed-task blocks. In mixed-task blocks, the tasks can either alternate in a prespecified sequence, for example, “AABBAABB” (Allport, Styles, & Hsieh, 1994; De Jong, 2000; Rogers & Monsell, 1995), or the task order can be unpredictable, and an explicit task cue is presented before stimulus onset (Meiran, 1996; Sudevan & Taylor, 1987). For a comparison of the two paradigms, see Koch (2005). 
In either case, there will be trials with task repeats and trials with task switches in mixed-task blocks. RTs tend to be longer for switch trials than for repeat trials. The difference is termed “switch cost.” Although no actual switch needs to happen in repeat trials, RTs will still be longer than those in single-task blocks, giving rise to a “mixing cost.” Both switch and mixing cost depend on the preparation time from the presentation of the cue or, in the absence of a cue, from the previous trial to the stimulus onset of the current trial. It is frequently observed that even with long preparation times of up to 5 s, there is still a considerable residual switch cost (Kimberg, Aguirre, & D'Esposito, 2000; Sohn, Ursu, Anderson, Stenger, & Carter, 2000). 
Switch cost is generally credited to a process of task-set reconfiguration, including a shift of attention between stimulus attributes, selection of the correct response action, and, depending on the task, reconfiguration of other task-specific cognitive processes. Mixing cost captures the extra effort involved in potentially (but not actually) having to switch to another task compared to a single-task condition, such as time for cue perception and interpretation. 
When attempting to determine the cost of shifting attention, switch cost is the more interesting effect. However, cost for attention shifts is confounded with other costs such as cost for cue encoding (Logan & Bundesen, 2003; Schneider & Logan, 2005) or motor response selection (Meiran, 2000) in its contribution to switch cost. To disentangle these effects, we have devised a paradigm with four tasks divided into two task groups of two tasks each (Figure 2). Tasks within groups relate to the same stimulus attribute and hence do not require an attentional shift, whereas switching between task sets from different groups requires shifting attention to a different stimulus attribute. Because the only difference in within-group versus between-group switches is the necessity to shift attention, the difference in switch cost between the two conditions will give us a measure for the cost of deploying top-down attention. 
Methods
Subjects
Six right-handed subjects (one female, five males) with normal or corrected-to-normal vision participated in the experiments, including one of the authors (D.B.W.). Subjects (ages 20–29, average 23) were recruited from Caltech's academic community and were paid for their participation. All subjects passed the Ishihara screening test for color vision without error and gave written informed consent. 
Apparatus
Stimuli were presented on a 20-in. Dell Trinitron CRT monitor (1,024 × 768 pixels, 3 × 8 bit RGB) at a refresh rate of 120 Hz. The display was synchronized with the vertical retrace of the monitor. Stimulus presentation and recording of the subjects' response were controlled with a Pentium 4 PC running Matlab R14 with the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997). Subjects were positioned approximately 100 cm from the computer screen. 
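Because the display is synchronized with the vertical retrace, every on-screen duration is a whole number of frames. The following is a minimal sketch of that conversion at the reported refresh rate of 120 Hz. It is not the authors' Matlab code, and the function names `ms_to_frames` and `frames_to_ms` are hypothetical.

```python
REFRESH_HZ = 120  # refresh rate reported in the Apparatus section


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Return the whole number of frames closest to a requested duration."""
    return round(duration_ms * refresh_hz / 1000)


def frames_to_ms(n_frames, refresh_hz=REFRESH_HZ):
    """Return the duration in ms that a given number of frames lasts."""
    return n_frames * 1000 / refresh_hz


# Cue-target intervals used in the experiment, expressed in frames.
for requested in (50, 200, 800):
    n = ms_to_frames(requested)
    print(requested, "ms ->", n, "frames =", round(frames_to_ms(n), 2), "ms")
```

Under this conversion, two frames last about 16.7 ms, which is consistent with the 17-ms cue duration quoted in the caption of Figure 3.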
Stimuli
Stimuli consist of gray-level images of natural scenes surrounded by a colored frame (Figure 1). The scenes were taken from a large commercially available CD-ROM library and from the World Wide Web (Li et al., 2002; Thorpe et al., 1996), allowing access to several thousand stimuli. The images were converted to 256 gray levels and rescaled to subtend an area of 4.4° × 6.6° of visual angle. Each image belongs to one of three classes: containing a clearly visible animal (e.g., birds, fish, mammals, insects), containing a clearly visible means of transport (e.g., trains, cars, airplanes, bicycles), or containing neither (distracter images); see Figure 1 for examples. We used more than 1,000 natural scenes of each of these classes, and each image was presented at most twice during the test sessions.
Figure 1
 
Example stimuli for means of transport (A), animals (B), and distracters (C) as well as example masks (D). Masks are created by superimposing a naturalistic texture on a mixture of white noise at different spatial frequencies (Li et al., 2002), surrounded by a frame with broken-up segments of orange, blue, and purple. In this figure, the thickness of the color frames is exaggerated threefold for illustration.
A 2.7′ (2 pixels) thick orange, blue, or purple frame surrounds the gray-level images. At full saturation, the CIE (r, g) coordinates of the colors are (0.666, 0.334), (0, 0), and (0.572, 0), respectively. The purple “distracter” color was chosen such that its hue component (0.875) in HSV space is equidistant from the hue components of the two “target” colors orange (0.083) and blue (0.667). The brightness (value) of all three colors was adjusted for perceptual equiluminance, using a technique based on minimizing flicker between colors at 14 Hz (Wagner & Boynton, 1972). During training, the saturation of the colors was decreased to make the task more difficult. The brightness was adjusted for each saturation level such that perceptual equiluminance between all three colors was maintained. Typically, saturation was decreased to about 0.15 during training, which corresponds to CIE (r, g) coordinates of (0.360, 0.333) for orange, (0.315, 0.315) for blue, and (0.355, 0.303) for purple.
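The equidistance of the purple hue from the two target hues only holds when hue is treated as a circular quantity that wraps around at 1. The short sketch below checks this with the hue values quoted above; the helper name `hue_distance` is hypothetical.

```python
def hue_distance(h1, h2):
    """Shortest distance between two hues on the HSV hue circle (period 1)."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)


# Hue components quoted in the text.
ORANGE, BLUE, PURPLE = 0.083, 0.667, 0.875

d_orange = hue_distance(PURPLE, ORANGE)  # wraps around through hue = 1
d_blue = hue_distance(PURPLE, BLUE)      # direct difference
print(round(d_orange, 3), round(d_blue, 3))
```

Both distances come out at about 0.208, so purple sits midway between orange and blue along the shorter arc of the hue circle.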
Experimental paradigm
The design of our stimuli allows us to define two groups of two tasks each that are as unrelated as possible, while still coinciding spatially (Figure 2). The first group of tasks (IMG tasks) consists of detecting whether an animal is present in the image (cued by the word “animal”) or whether a means of transport is present (cued by the word “transport”). This has been shown to be possible without color information (Biederman & Ju, 1988; Delorme, Richard, & Fabre-Thorpe, 2000; Fei-Fei, VanRullen, Koch, & Perona, 2005). Staining the grayscale image with a color, on the other hand, does interfere with perception (Oliva & Schyns, 2000). For this reason, we decided to add a color frame to the image rather than adding the color for the color detection task to the image itself.
Figure 2
 
The four tasks used in our experiments, here represented by the words used to cue subjects. The animal and vehicle detection tasks form the IMG task group because both relate to the natural scene photograph. Relating to the color frame around the image, the orange and the blue detection tasks make up the COL task group. In mixed-task blocks, any two of these four tasks are interleaved. Thus, we distinguish within-group switches (e.g., from “animal” to “transport”) from between-group switches (e.g., from “blue” to “animal”). Comparing switch costs between these two kinds of switches enables us to determine the cost of shifting attention to a different stimulus attribute.
The second task group (COL tasks) relates to the color of the frame, namely, detecting an orange frame (“orange”) or a blue frame (“blue”) around the image. Stimuli are displayed at a random location with fixed eccentricity from fixation to avoid spatial attention effects. Despite the randomization of the location of the stimulus on the screen during the experiment, having the color present around rather than within the image means that shifting attention from the image to the frame, for instance, still has a spatial component. Because of the randomization of the position, however, this is not spatial attention that subjects could prepare for by enhancing a particular retinotopic location. Rather, the relative spatial location of the relevant attribute within the stimulus is one of the properties that subjects attend to when successfully cued. 
The stimulus attributes and tasks were chosen in a way that makes them as dissimilar as possible. IMG tasks are based on luminance information only, and they relate to high-level content of the images. COL tasks rely only on chromaticity and require a relatively low-level detection of particular hues. In selecting stimulus attributes that are processed in such different, almost complementary ways, we aim to maximize the cost of shifting attention between them.
To compare RTs in situations with and without task-set switching, two kinds of blocks are used, single-task blocks (48 trials) and mixed-task blocks (96 trials). In single-task blocks, the task for the entire block is indicated by an instruction screen preceding the block. For mixed-task blocks, subjects are instructed to solve two out of the four possible tasks, either the two IMG tasks, or the two COL tasks, or one task from each group. Within each mixed-task block, equal numbers of trials for the two tasks are shuffled randomly to make their order unpredictable. This procedure results in a statistically equal number of trials with the same task as the preceding trial (repeat) and with the other task (switch). The task for each trial is indicated by a word cue presented at the center of the screen at cue-target intervals (CTIs) of 50, 200, and 800 ms before target onset. One CTI is used throughout a given block. For consistency, word cues are presented in both types of blocks, although they serve no purpose in single-task blocks. Figure 3 shows the time course of a typical trial. 
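The construction of a mixed-task block described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name `make_mixed_block` and the label "first" for the opening trial (which has no predecessor) are hypothetical.

```python
import random


def make_mixed_block(task_a, task_b, n_trials=96, rng=None):
    """Shuffle equal numbers of two tasks and label each trial.

    Each trial is returned as (task, kind), where kind is 'repeat' if the
    task matches the preceding trial, 'switch' if it differs, and 'first'
    for the opening trial of the block.
    """
    rng = rng or random.Random()
    tasks = [task_a] * (n_trials // 2) + [task_b] * (n_trials // 2)
    rng.shuffle(tasks)
    trials = []
    for i, task in enumerate(tasks):
        if i == 0:
            kind = "first"
        elif task == tasks[i - 1]:
            kind = "repeat"
        else:
            kind = "switch"
        trials.append((task, kind))
    return trials


# One between-group block: an IMG task mixed with a COL task.
block = make_mixed_block("animal", "blue", rng=random.Random(0))
n_switch = sum(kind == "switch" for _, kind in block)
n_repeat = sum(kind == "repeat" for _, kind in block)
print(n_switch, n_repeat)
```

Because the order is a random shuffle, switch and repeat counts are only approximately equal in any single block; they balance out on average across blocks.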
Figure 3
 
Time course of a trial, starting 1,300 ms before target onset with a blank gray screen. At 650 ± 25 ms before target onset, a white fixation dot (4.1′ × 4.1′) is presented at the center of the display. At a variable cue-target interval (CTI) before target onset, a word cue (0.5° high, between 1.1° and 2.5° wide) appears at the center of the screen for 17 ms (two frames), temporarily replacing the fixation dot for CTIs less than 650 ms. At 0 ms, the target stimulus, consisting of a gray-level photograph and a color frame around it, is presented at a random position on a circle around the fixation dot, such that the image is centered around 6.4° eccentricity. After a stimulus onset asynchrony (SOA) of 200–242 ms, the target stimulus is replaced with a perceptual mask. The mask is presented for 500 ms, followed by 1,000 ms of blank gray screen to allow the subjects to respond. In case of an error, acoustic feedback is given (pure tone at 800 Hz for 100 ms), followed by 100 ms of silence. After this, the next trial commences.
On each trial, the probability of seeing a positive (i.e., as cued) example is 50%, the probability for an example of the noncued class is 25%, and the probability for a distracter (nontarget) example is 25%, as illustrated in Table 1. The probabilities for the target frame colors orange and blue and the distracter color purple are distributed in an analogous manner. 
Table 1
 
Stimulus probabilities depending on task.
| Task | Animal images | Transport images | Distracter images |
| --- | --- | --- | --- |
| Animal | 50% | 25% | 25% |
| Transport | 25% | 50% | 25% |
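The probabilities in Table 1 can be illustrated with a small sampling sketch. Drawing the image class independently on every trial is an assumption made here for simplicity; the text does not say whether the classes were sampled independently or assigned in fixed counts per block. The function name `sample_image_class` is hypothetical.

```python
import random


def sample_image_class(cued, rng):
    """Draw the image class for one IMG trial, given the cued task.

    The cued class appears with probability 0.5; the noncued target class
    and the distracter class each appear with probability 0.25 (Table 1).
    """
    other = "transport" if cued == "animal" else "animal"
    classes = [cued, other, "distracter"]
    return rng.choices(classes, weights=[0.50, 0.25, 0.25])[0]


rng = random.Random(1)
draws = [sample_image_class("animal", rng) for _ in range(10000)]
share_animal = draws.count("animal") / len(draws)
print(round(share_animal, 2))
```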
Subjects were instructed to hold the left mouse button pressed with the index finger of their right hand throughout the block, and to only briefly release it as soon as they detect the cued target property for a given trial. If no response is given within 1,500 ms of mask onset, a negative response is assumed (speeded go/no-go response). RT is measured for correct positive responses as the time passed between the onset of the target stimulus and the registration of the mouse button release event. If subjects give a positive response, that is, release the mouse button, the 1,000-ms waiting period after the mask is cut short. In case of an error, acoustic feedback is given. 
Subjects were trained on single-task blocks for 2–3 hr (40–60 blocks of 48 trials). For each image class, a randomly chosen subset of 80 images was set aside and reused repeatedly for training, but not for testing. During training, the stimulus onset asynchrony (SOA) was adjusted in a staircase procedure based on the performance in IMG blocks, starting with an initial 400 ms. A stable target performance between 88% and 92% correct was achieved with SOAs between 200 and 242 ms (average 214 ms). The same SOA as for IMG was also used for COL blocks. To achieve the same level of difficulty, we decreased the saturation of the colors in a staircase procedure with a logarithmic scale, starting with 1 down to between 0.098 and 0.185 (average 0.151). At the end of training, SOA and saturation were fixed for each subject. 
The seven 1-hr test sessions for each subject consisted of 10 mixed-task blocks (96 trials) interleaved with five single-task blocks (48 trials). All positive IMG trials were done with images that the subjects had seen at most once before, thus avoiding overtraining on individual images. The order of blocks with different CTIs and task combinations was randomized within each session and counterbalanced across sessions. 
Data analysis
RT was recorded for correct positive trials. After discarding the first trial of each block, trials with an RT more than four standard deviations above the mean (above 1,108 ms) or below 200 ms were discarded as outliers (1% of the data). Error rates and RTs were pooled separately for switch and repeat trials in mixed-task blocks and over all trials in single-task blocks. These block results were pooled separately for each CTI value, and, for some analyses, for each task group and switch condition over all sessions for all six subjects (42 sessions in total), and the standard error of the mean (SEM) was computed.
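The exclusion rule can be sketched as below. Two points are assumptions rather than details from the text: the mean and standard deviation are computed over all remaining trials pooled together, and each inner list is taken to start with the first trial of its block. The function name `exclude_outliers` and the demonstration RTs are made up for illustration.

```python
import statistics


def exclude_outliers(blocks, n_sd=4.0, floor_ms=200.0):
    """Drop the first trial of each block, then trim RT outliers.

    `blocks` is a list of lists of RTs in ms, one inner list per block.
    Returns the retained RTs and the upper cutoff (mean + n_sd * SD).
    """
    pooled = [rt for block in blocks for rt in block[1:]]
    ceiling = statistics.mean(pooled) + n_sd * statistics.stdev(pooled)
    kept = [rt for rt in pooled if floor_ms <= rt <= ceiling]
    return kept, ceiling


# Made-up RTs: two blocks, one anticipation (150 ms) and one lapse (3000 ms).
block_1 = [900] + [440, 460] * 20 + [150]
block_2 = [700] + [450] * 20 + [3000]
kept, ceiling = exclude_outliers([block_1, block_2])
print(len(kept), round(ceiling))
```

In the study itself, the four-standard-deviation criterion corresponded to a cutoff of 1,108 ms.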
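Mixing and switch costs are computed as
\[
C_{\mathrm{mix}} = \langle RT_{\mathrm{repeat}} \rangle - \langle RT_{\mathrm{single}} \rangle, \qquad
C_{\mathrm{switch}} = \langle RT_{\mathrm{switch}} \rangle - \langle RT_{\mathrm{repeat}} \rangle, \tag{1}
\]
where 〈·〉 denotes the mean over sessions and subjects. Their standard errors (SE) are derived as follows:
\[
SE(C_{\mathrm{mix}}) = \sqrt{\frac{\operatorname{var}(RT_{\mathrm{repeat}})}{N_{\mathrm{repeat}}} + \frac{\operatorname{var}(RT_{\mathrm{single}})}{N_{\mathrm{single}}}}, \qquad
SE(C_{\mathrm{switch}}) = \sqrt{\frac{\operatorname{var}(RT_{\mathrm{switch}})}{N_{\mathrm{switch}}} + \frac{\operatorname{var}(RT_{\mathrm{repeat}})}{N_{\mathrm{repeat}}}}. \tag{2}
\]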
Mixing and switch costs are analyzed using N-way analyses of variance (ANOVAs). The significance of specific mixing and switch costs is determined by testing whether the two constituent RT samples are drawn from populations with different means, using an unmatched t test. We mark alpha levels of 0.05 (*), 0.01 (**), and 0.001 (***) in figures and tables. 
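As a rough sketch, the costs in Equation 1 and their standard errors in Equation 2 could be computed as follows. The RT values are made up for illustration and are not data from the study, the helper name `cost_and_se` is hypothetical, and the use of the sample variance (dividing by N − 1) is an assumption.

```python
import math
import statistics


def cost_and_se(rt_a, rt_b):
    """Return mean(rt_a) - mean(rt_b) and the SE of that difference."""
    cost = statistics.mean(rt_a) - statistics.mean(rt_b)
    se = math.sqrt(
        statistics.variance(rt_a) / len(rt_a)
        + statistics.variance(rt_b) / len(rt_b)
    )
    return cost, se


# Made-up RTs in ms, for illustration only.
rt_single = [430, 445, 452, 438, 460, 441]
rt_repeat = [470, 488, 495, 476, 502, 481]
rt_switch = [492, 505, 518, 497, 521, 503]

c_mix, se_mix = cost_and_se(rt_repeat, rt_single)        # mixing cost
c_switch, se_switch = cost_and_se(rt_switch, rt_repeat)  # switch cost
t_switch = c_switch / se_switch  # cost expressed in units of its SE
print(round(c_mix, 1), round(c_switch, 1))
```

Dividing a cost by its standard error gives the unequal-variance form of a two-sample t statistic. Whether the unmatched t tests reported here used this form or a pooled-variance form is not stated in the text.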
Results
Figure 4 shows RTs for single-task blocks, repeat trials, and switch trials in mixed-task blocks for the three values of CTI. For single-task blocks, RT is independent of CTI. RTs for task-repeat and task-switch trials are the same as for single-task blocks for CTI values of 200 and 800 ms but differ from single-task blocks for 50 ms, giving rise to mixing and switch cost as defined in Equation 1.
Figure 4
 
Reaction times (RTs) for single-task blocks, task-repeat trials, and task-switch trials in mixed-task blocks for n = 6 subjects. Error bars are SEM. Both mixing and switch costs are significant for a CTI of 50 ms, but not for cue-target intervals (CTIs) of 200 and 800 ms. The drop of the single-task RT at 200 ms compared to 50 and 800 ms is not significant (p > 0.05, t test).
An ANOVA of mixing cost reveals significant main effects for the factors CTI, task group, and subject identity (Table 2). None of the two-way interactions are significant.
Table 2
 
Three-way analysis of variance (ANOVA) for mixing cost in reaction time at cue-target interval (CTI) = 50 ms.
| Source | df | Mean square | F | p |
| --- | --- | --- | --- | --- |
| CTI | 2 | 5,616 | 18.55 | 0.0004 *** |
| Task group | 1 | 1,730 | 5.71 | 0.038 * |
| Subject identity | 5 | 2,407 | 7.95 | 0.003 ** |
The values for mixing cost for the two task groups and the three CTI values are given in Table 3. There is significant mixing cost for both task groups for CTI = 50 ms. At a CTI of 200 ms, mixing cost is barely significant for IMG, but not for COL tasks. There is no mixing cost for CTI = 800 ms. Significance was determined with t tests. 
Table 3
 
Reaction time mixing cost (in ms) by task group for the three values of the cue-target interval (CTI). Standard errors are given in brackets.
| Task group | 50 ms | 200 ms | 800 ms |
| --- | --- | --- | --- |
| IMG | 62.8 (11.5)*** | 29.7 (11.9)* | 9.7 (10.5) |
| COL | 39.1 (12.8)** | 13.1 (14.4) | 4.2 (12.9) |
| Over all | 49.0 (8.6)*** | 19.7 (9.2) | 3.6 (8.3) |
For analyzing switch cost, we use switch condition (within or between task groups) as a fourth variable for the ANOVA (Table 4). We obtain a significant effect for it as well as for CTI, but not for task group or subject identity. None of the six possible two-way interactions are significant.
Table 4
 
Four-way analysis of variance (ANOVA) for switch cost in reaction time.
| Source | df | Mean square | F | p |
| --- | --- | --- | --- | --- |
| CTI | 2 | 1,559 | 6.01 | 0.006 ** |
| Task group | 1 | 140 | 0.54 | 0.47 ns |
| Subject identity | 5 | 509 | 1.96 | 0.11 ns |
| Switch condition | 1 | 4,390 | 16.93 | 0.0002 *** |
Table 5 shows the values of switch cost by switch condition. Switch cost is only significant at CTI = 50 ms for switching between COL and IMG tasks, but not for switching within a task group. There is no switch cost for CTIs of 200 or 800 ms. As before, significance was tested with t tests.
Table 5
 
Switch cost (in ms) by switch condition for the three cue-target interval (CTI) values. Standard errors are given in brackets.
| Switch condition | 50 ms | 200 ms | 800 ms |
| --- | --- | --- | --- |
| IMG→IMG | 0.3 (11.5) | −0.5 (10.4) | 9.3 (11.4) |
| COL→COL | 7.7 (11.6) | −4.3 (13.5) | −12.8 (10.8) |
| COL→IMG | 23.5 (12.0)* | 11.2 (11.6) | 0.8 (10.6) |
| IMG→COL | 30.7 (11.5)** | 11.7 (12.1) | −1.1 (10.7) |
| Over all | 16.1 (6.9)* | 4.6 (7.0) | −1.2 (6.5) |
Figure 5 illustrates the dependence of RT switch cost on switch condition for CTI = 50 ms.
Figure 5
 
Switch cost in reaction time (RT) at a cue-target interval (CTI) of 50 ms is only significant for between-group switches (red) but not for within-group switches (blue). Error bars are SE as defined in Equation 2. For an illustration of between- and within-group switches, see Figure 2. Note that the switch cost of 16.1 ms in Figure 4 arises from pooling the data over the four switch conditions that are shown individually in this figure.
The error rate over all conditions is 10.7 ± 7.4%, conforming to the training goal of 88–92% correct performance. About 10% of all blocks are free of errors. Mean error rates for the individual conditions (CTI and trial type) vary between 9.1% and 12.8%. There are no systematic effects in error rate. This finding is in agreement with a similar study by Slagter et al. (2006), who also found no switch cost in error rate. 
Discussion
We set out to find the difference in switch cost of between- and within-group task-set switches to shed light on the cost of shifting attention across different stimulus attributes. We did indeed find significant switch costs of 23 ± 12 ms (COL to IMG) and 31 ± 12 ms (IMG to COL) for between-group switches, but no significant switch cost for within-group (IMG to IMG or COL to COL) switches. 
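The contrast between the two kinds of switches can be made concrete with the switch costs reported in Table 5 for a CTI of 50 ms. The unweighted averaging below is illustrative arithmetic only. It is not a statistic reported in this study, and it ignores differences in trial counts between conditions.

```python
# Switch costs in ms at CTI = 50 ms, copied from Table 5.
switch_cost = {
    "IMG->IMG": 0.3,
    "COL->COL": 7.7,
    "COL->IMG": 23.5,
    "IMG->COL": 30.7,
}


def mean_cost(conditions):
    """Unweighted mean switch cost over the named conditions."""
    return sum(switch_cost[c] for c in conditions) / len(conditions)


within = mean_cost(["IMG->IMG", "COL->COL"])   # no attribute shift needed
between = mean_cost(["COL->IMG", "IMG->COL"])  # attribute shift needed
difference = between - within
print(round(within, 1), round(between, 1), round(difference, 1))
```

The unweighted difference is about 23 ms, at the lower end of the 23–31 ms range of between-group switch costs.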
Decision strategy cannot explain the results
It might be suggested that differences in decision strategy rather than shifting attention cause the difference in switch cost. Decision strategies might differ between subjects, or the same subject might resort to different strategies when confronted with single-task blocks compared to mixed-task blocks. 
RT differences resulting from differences in decision strategy between single-task and mixed-task blocks are contained in mixing cost, which represents the contrast between single-task blocks and repeat trials in mixed-task blocks. Switch cost, on the other hand, represents contrasts between switch and repeat trials within mixed-task blocks. Because of the randomization of the trials within mixed-task blocks, no advance preparation for the type of trial is possible before cue onset. Therefore, we are confident that the reported differences in switch cost are due to shifts in attention to the other stimulus attribute rather than differences in decision strategy. 
Interestingly, individual differences between subjects are also contained in the mixing cost (see Table 2), whereas switch cost does not depend on subject identity (see Table 4). 
The same is true for remaining RT differences between the two task groups despite our attempts to equalize task difficulty. Mixing cost for IMG (63 ± 12 ms) is larger than for COL (39 ± 13 ms) tasks. However, such a difference is not present in switch cost (see Table 4). The paradigm of distinguishing mixing and switch costs allows us to catch all these variations in mixing cost while keeping switch cost unaffected by them. 
Mechanisms of task-set switching
Both mixing and switch costs are significant only for a CTI of 50 ms, but not for 200 ms or longer in our study. This agrees with the findings of Wolfe et al. (2004) that cueing becomes fully effective within 200 ms from cue onset in their visual search paradigm. This means that a CTI of 200 ms is sufficient for perceiving the cue and shifting attention to the cued stimulus attribute without incurring an RT penalty compared with perceiving the cue and not having to shift attention. Thus, 200 ms is an upper bound on the time it takes to shift attention. Presenting the cue with a CTI of 50 ms allows subjects to perform the task (no significant switch cost in error rate), but at a penalty of 23–31 ms in RT if an attention shift is required. 
What happened to the other contributors to switch cost, in particular remapping of the motor response? The lack of a significant within-group switch cost seems to suggest that there is no significant cost for motor remapping, which appears to contradict the results of Meiran (2000), who found significant contributions of both stimulus and response switching to the switch cost. This discrepancy may be explained by differences in the experimental design. Meiran as well as most task switching studies (e.g., De Jong, 2000; Kleinsorge, 2004; Meiran, 1996; Rogers & Monsell, 1995) use a two-alternative forced choice design, typically instructing subjects to operate the two keys with different hands and thus requiring coordination of the motor response across both hemispheres of the brain. Compared to these designs, our go/no-go response by releasing a mouse button with only the right hand is rather simple and may not require as much time to be reassigned, thus accounting for the absence of switch cost for within-group switches. 
Traditionally, switch cost is attributed to two mechanisms (Allport & Wylie, 1999; Sohn & Anderson, 2001). The first one is task inertia, a leftover representation of the previous task set, which prevents the new task set from being loaded (Allport et al., 1994; Gilbert & Shallice, 2002; Wylie & Allport, 2000). Task inertia manifests itself in decreasing switch cost with increasing preparation time. In our present study, we do indeed see significant switch cost for short CTI, but no switch cost for long CTIs. 
The second mechanism thought to be responsible for switch cost is an incomplete reconfiguration of the new task set before the onset of the relevant stimulus. This component does not depend on preparation time and is observable as a residual cost even at long intertrial or cue-target intervals (Altmann, 2004a, 2004b; De Jong, 2000; Kimberg et al., 2000; Nieuwenhuis & Monsell, 2002; Sohn et al., 2000). However, the presence of residual switch cost depends on the details of the experimental paradigm (Rogers & Monsell, 1995; Stoet & Snyder, 2003). In our study, we found no residual switch cost for long CTIs. 
A third interpretation of switch cost has been proposed by Logan et al. (Logan & Bundesen, 2003; Schneider & Logan, 2005). Logan and Bundesen (2003) point out that usually each task is associated with one particular cue. Hence, switch cost could be mostly due to encoding the new cue and only to a small extent, if at all, due to task-set reconfiguration. This account has been disputed by Monsell and Mizon (2006), only to be reaffirmed recently by Logan, Schneider, and Bundesen (2007). 
Although the outcome of this heated debate has important ramifications for understanding task-set switching in general, it does not affect the interpretation of our results. In our present experiment, we contrast switch cost for within-attribute switches with switch cost for between-attribute switches. Whatever cost is incurred by cue encoding in switch trials compared to repeat trials is incurred in both of these switch scenarios. In all switch trials, the word cue needs to be encoded and interpreted, and the new task set needs to be activated. 
The only difference between these two switch modes is that between-group switching requires the shift of attention to another stimulus attribute, whereas within-group switching does not. Stimulus attributes were chosen to be as dissimilar as possible to maximize this effect. 
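The distinction between the two switch modes can be stated as a simple rule over consecutive trials. The Python sketch below labels each trial as a repeat, a within-group switch, or a between-group switch, using the four cue words and the two task groups (IMG and COL) of our design. The function names and the example sequences are hypothetical and serve only as an illustration.

```python
# Task groups: two tasks on the image content (IMG), two on the frame color (COL).
TASK_GROUP = {
    "animal": "IMG",
    "transport": "IMG",
    "orange": "COL",
    "blue": "COL",
}


def classify_transition(previous_task, current_task):
    """Label a trial by how its task relates to the task of the previous trial."""
    if previous_task == current_task:
        return "repeat"
    if TASK_GROUP[previous_task] == TASK_GROUP[current_task]:
        return "within-group switch"
    return "between-group switch"


def label_sequence(tasks):
    """Label every trial after the first one in a sequence of cued tasks."""
    return [classify_transition(a, b) for a, b in zip(tasks, tasks[1:])]


if __name__ == "__main__":
    # Hypothetical cue sequences; each mixed-task block interleaves two tasks.
    print(label_sequence(["animal", "animal", "transport"]))
    print(label_sequence(["blue", "animal", "animal"]))
```

Only trials labeled as between-group switches require attention to move to the other stimulus attribute, so comparing their RT cost with that of within-group switches isolates the contribution of the attention shift.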
By the same argument, we can also exclude reallocation of other cognitive resources such as working memory as the cause of the observed differences in RT cost. These resources need to be recruited equally in within- and between-group switches. We conclude that the RT cost of shifting attention in our fast detection paradigm is 23–31 ms. 
Neural correlates of task-set switching
Neural correlates for both switch and mixing cost have been found in human lateral prefrontal cortex (often lateralized to the right hemisphere) and parietal areas (often lateralized to the left hemisphere) in a number of fMRI (Barber & Carter, 2005; Brass & von Cramon, 2004; Braver, Reynolds, & Donaldson, 2003; Crone, Wendelken, Donohue, & Bunge, 2006; Dove, Pollmann, Schubert, Wiggins, & von Cramon, 2000; Gruber, Karch, Schlueter, Falkai, & Goschke, 2006; Kimberg et al., 2000; Smith, Taylor, Brammer, & Rubia, 2004; Sohn et al., 2000; Swainson et al., 2003; Yeung, Nystrom, Aronson, & Cohen, 2006) and lesion studies (Aron, Monsell, Sahakian, & Robbins, 2004). These brain networks have been implicated in cognitive control of task reconfiguration (Shulman, d'Avossa, Tansy, & Corbetta, 2002; Swainson et al., 2003). 
Task-set switching has been explored with a variety of tasks and stimuli, such as determining whether a letter is a vowel or a consonant, whether a digit is odd or even, whether a digit is above or below 5, or whether a word describes an animal; Stroop word/color naming tasks; and distinguishing “+” from “−.” Most of these tasks are designed to test the cognitive aspects of interpreting the stimuli. Few studies consider the perceptual aspects of switching between visual attributes of stimuli. 
Shulman et al. (2002) used color and motion detection tasks in an fMRI study of task-set switching. They found activity in left posterior parietal cortex and left frontal cortex correlated with task specification and preparation independent of the relevant stimulus attribute (motion or color). In addition, motion- and color-specific signals were also observed in left parietal cortex, whereas task-specific areas, such as MT+ for motion, showed no modulation due to the relevance of their preferred stimulus attribute for the task. 
Parietal and MT+ activations were also found by Liu, Slotnick, Serences, and Yantis (2003) with the cue embedded into a continuous rapid serial stream of color and motion stimuli. In their paradigm, Liu et al. did not find frontal activation. Furthermore, they did not find any RT cost for switching between color and motion tasks. Both findings may be due to the requirement for subjects to hold two potential tasks in mind during these experiments so that no actual set switching might be necessary (Liu et al., 2003). 
Crone et al. (2006) used images of object categories in their fMRI study of task switching, but they focused on rule representation in the brain rather than visual perception. Yeung et al. (2006) used the known sensitivity of areas in prefrontal and occipital cortex to faces and words to show that increased fMRI activity in the irrelevant area during switching is correlated with increased RT. Their stimuli were not designed to compare within- and between-category switches. 
In their recent ERP (Rushworth, Passingham, & Nobre, 2005) and fMRI (Rushworth, Paus, & Sipila, 2001) studies, Rushworth et al. used a design similar to ours, where they required subjects to pay attention to the color or the shape of one of two presented stimuli to detect a rare target. Although they did find significant switch cost in RT, they only considered switches between stimulus attributes, and they did not compare them with switches within attributes. 
Mechanisms of attention
Attention is frequently classified as either spatial, feature based, or object based. Of what nature then is the shift of attention between the stimulus attributes in our experiments? 
We minimize potential contributions of spatial attention by randomizing the location of the stimulus from trial to trial while keeping eccentricity fixed, thus making the position of the stimulus, and hence its attributes, unpredictable. Once the stimulus location is known to subjects after stimulus onset, subjects may use the spatial location of the image versus the frame around the image as one of the properties of the attributes IMG or COL, in addition to hue content, spatial frequency content, etc. Subjects cannot bias their spatial attention according to the cued task from the top down before stimulus onset, however, because the location of the relevant attribute is relative to the (then unknown) position of the stimulus on the screen. 
The concept of object-based attention can refer to attending to the space occupied by an object (Egly, Driver, & Rafal, 1994; Moore, Yantis, & Vaughan, 1998), or it can refer to nonspatial attention to, for instance, overlapping objects (Duncan, 1984; O'Craven, Downing, & Kanwisher, 1999; Reynolds, Alborzian, & Stoner, 2003; Roelfsema, Lamme, & Spekreijse, 1998). In the latter case, objects are defined by high-level interpretation, such as object category or identity, or by low-level features, such as coherent motion or color. 
It could be argued that in our experiments the image and the color frame constitute two overlapping objects defined by their respective features. On the other hand, because they cooccur in a very consistent manner, the colored frame and the grayscale image could also be seen as two attributes or features of the same object (image with frame). In this case, the attention shift observed in our experiments could be classified as feature-based attention. Our experiments do not address the fine distinction between feature-based attention and attention to overlapping objects based on their features. 
In a design similar to ours, Slagter et al. (2006) found no difference in switch cost for within- versus between-attribute switches. In these experiments, subjects were cued endogenously to attend to a rectangle at a particular off-center location or to a rectangle of a particular color at fixation and report its orientation. 
A main difference between this and our study is that Slagter et al. (2006) require their subjects to shift attention to another object (one of the four rectangles in their display) in within- as well as in between-attribute switches. In our study, by contrast, attention was shifted to another attribute (frame color versus image content) within the object (image with frame). Therefore, the switch cost that Slagter et al. observe in their experiments may be dominated by the cost of shifting attention to another object (defined by location and/or color), whereas the switch cost observed in our experiments is due to shifting attention to another attribute of the object. 
This would also explain why Slagter et al. (2006) find significant switch costs for a CTI of 1,500 ms, whereas we fail to find switch cost for CTIs as short as 200 ms. In the object file metaphor by Kahneman and Treisman (1984), shifting attention to another object requires closing the old object file and opening another one. Shifting attention to another attribute, on the other hand, corresponds to accessing another entry within the same object file, which may require much less preparatory time than accessing a new file. 
In the fMRI part of their study, Slagter et al. (2006) implicate parietal areas (both intraparietal sulci and the right precuneus) as well as premotor areas in the difference between within- and between-attribute switches. In the discussion of their results, they mainly focus on control and response selection issues. In future fMRI studies, it would be interesting to investigate the perceptual aspects of shifting attention to another object versus shifting attention to another attribute of the same object to see if the deviating accounts of switch cost can be corroborated by differential brain activity. 
Implications for visual perception
To our knowledge, our study is the first to arrive at a measure for shifting attention between visual attributes of a stimulus by comparing switch cost in within- and between-attribute switches using natural scenes (but see Slagter et al., 2006, for a similar study using colored rectangles). Given the large body of previous work in task-set switching, it is hardly surprising that we do find significant switch costs in our experiments. The comparison between one switch condition that requires shifting attention to the other stimulus attribute and one that does not require this kind of attention shift pins the measured difference in switch cost on the cost for the attention shift. 
What do our results mean for the top-down control of visual perception? Due to its short processing time, fast object detection in natural scenes of the sort shown by Thorpe et al. (1996) is assumed to be possible in a purely feed-forward, hierarchical model of the ventral pathway (Riesenhuber & Poggio, 1999; Serre, Oliva, & Poggio, 2007; Thorpe, Delorme, & Van Rullen, 2001). It was demonstrated by Li et al. (2002) that this is still possible when spatial attention is tied to a low-level search task in a dual-task paradigm. Our results show that switching from a low-level color detection task to a natural scene categorization task (and vice versa) incurs an RT cost that is not present when switching within stimulus attributes. 
Switching top-down attention to a different feature value within the same stimulus attribute requires biasing feed-forward connections in such a hierarchical system at fairly high levels, for example, in inferotemporal cortex or even the connections from inferotemporal cortex to prefrontal cortex for object categories (Freedman, Riesenhuber, Poggio, & Miller, 2003). For example, two different classifiers, possibly located in the prefrontal cortex, would access the same data in inferotemporal cortex to decide whether an animal or a vehicle is present in the image (Hung, Kreiman, Poggio, & DiCarlo, 2005). Thus, the effect of top-down attention could be interpreted as switching one set of synaptic weights for another one. 
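As a toy illustration of this idea, the Python sketch below applies two different linear readouts to one shared feature vector, so that changing the task amounts to selecting a different weight vector while the input stays the same. The feature values, weights, and function names are hypothetical; this is not a model of inferotemporal or prefrontal cortex and is not taken from the cited studies.

```python
def linear_readout(features, weights, bias=0.0):
    """Return True if the weighted sum of the features exceeds zero."""
    return sum(f * w for f, w in zip(features, weights)) + bias > 0


# One hypothetical weight vector per task; both read the same features.
TASK_WEIGHTS = {
    "animal": [1.0, -1.0, 0.0],
    "transport": [-1.0, 1.0, 0.0],
}


def detect(task, features):
    """Apply the readout selected by the current task to the shared features."""
    return linear_readout(features, TASK_WEIGHTS[task])


if __name__ == "__main__":
    # Hypothetical population response to a single image.
    shared_features = [0.9, 0.1, 0.4]
    for task in ("animal", "transport"):
        print(task, detect(task, shared_features))
```

In this sketch a within-attribute switch changes only the key used to look up the weights; the feature vector itself is left untouched.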
When switching to a different stimulus attribute, processed by a different visual area, one would assume that task-specific biasing of neural activity has to happen at an earlier stage in the hierarchy before specialization of processing streams takes place. In the case of switching between color and object detection, this could be area V4, V2, or even V1. Our current results indicate a higher cost in RT for task switches between attributes, that is, for biasing at an earlier stage, than within attributes, that is, biasing at a later, more specialized stage. This finding agrees with ideas of a reverse hierarchy put forward by Hochstein and Ahissar (2002). 
On the other hand, it is also quite possible that the difference in switch costs has nothing to do with different locations in the visual processing hierarchy but instead reflects how easily activity can be activated and read out from nearby neurons encoding similar attributes (e.g., one color versus another), as compared with more far-flung neuronal populations encoding quite different stimulus attributes (e.g., color versus object category). 
Acknowledgments
We wish to thank Christof Koch, Shinsuke Shimojo, Farshad Moradi, Rufin van Rullen, and Diane Beck for insightful discussions. Lisa Fukui collaborated on early pilot studies. This project was funded by the NSF Engineering Research Center for Neuromorphic Systems Engineering at Caltech, a Sloan-Swartz Pre-doctoral Fellowship and a Beckman Postdoctoral Fellowship to D.B.W., and a Microsoft Research New Faculty Fellowship to L.F.F. 
Commercial relationships: none. 
Corresponding Author: Dirk B. Walther. 
Address: Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA. 
References
Allport, A. Styles, E. A. Hsieh, S. L. (1994). Shifting intentional set: Exploring the dynamic control of tasks. In C. Umilta & M. Moscovitch (Eds.), Attention and performance XV (pp. 421–452). Cambridge, MA: MIT Press.
Allport, A. Wylie, G. (1999). Task-switching: Positive and negative priming of task-set. In G. W. Humphreys, J. Duncan, & A. M. Treisman (Eds.), Attention, space and action: Studies in cognitive neuroscience (pp. 273–296). Oxford, England: Oxford University Press.
Altmann, E. M. (2004a). Advance preparation in task switching: What work is being done? Psychological Science, 15, 616–622. [PubMed] [CrossRef]
Altmann, E. M. (2004b). The preparation effect in task switching: Carryover of SOA. Memory & Cognition, 32, 153–163. [PubMed] [CrossRef]
Aron, A. R. Monsell, S. Sahakian, B. J. Robbins, T. W. (2004). A componential analysis of task-switching deficits associated with lesions of left and right frontal cortex. Brain, 127, 1561–1573. [PubMed] [Article] [CrossRef] [PubMed]
Barber, A. D. Carter, C. S. (2005). Cognitive control involved in overcoming prepotent response tendencies and switching between tasks. Cerebral Cortex, 15, 899–912. [PubMed] [Article] [CrossRef] [PubMed]
Biederman, I. Ju, G. (1988). Surface versus edge-based determinants of visual recognition. Cognitive Psychology, 20, 38–64. [PubMed] [CrossRef] [PubMed]
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. [PubMed] [CrossRef] [PubMed]
Brass, M. von Cramon, D. Y. (2004). Decomposing components of task preparation with functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 16, 609–620. [PubMed] [CrossRef] [PubMed]
Braver, T. S. Reynolds, J. R. Donaldson, D. I. (2003). Neural mechanisms of transient and sustained cognitive control during task switching. Neuron, 39, 713–726. [PubMed] [Article] [CrossRef] [PubMed]
Crone, E. A. Wendelken, C. Donohue, S. E. Bunge, S. A. (2006). Neural evidence for dissociable components of task-switching. Cerebral Cortex, 16, 475–486. [PubMed] [Article] [CrossRef] [PubMed]
De Jong, R. (2000). An intention–activation account of residual switch costs. In S. Monsell & J. Driver (Eds.), Control of cognitive processes: Attention and performance XVIII (pp. 357–376). Cambridge, MA: MIT Press.
Delorme, A. Richard, G. Fabre-Thorpe, M. (2000). Ultra-rapid categorisation of natural scenes does not rely on colour cues: A study in monkeys and humans. Vision Research, 40, 2187–2200. [PubMed] [CrossRef] [PubMed]
Dove, A. Pollmann, S. Schubert, T. Wiggins, C. J. von Cramon, D. Y. (2000). Prefrontal cortex activation in task switching: An event-related fMRI study. Cognitive Brain Research, 9, 103–109. [PubMed] [CrossRef] [PubMed]
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517. [PubMed] [CrossRef] [PubMed]
Egly, R. Driver, J. Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123, 161–177. [PubMed] [CrossRef] [PubMed]
Fei-Fei, L. VanRullen, R. Koch, C. Perona, P. (2005). Why does natural scene categorization require little attention? Exploring attentional requirements for natural and synthetic stimuli. Visual Cognition, 12, 893–924. [CrossRef]
Freedman, D. J. Riesenhuber, M. Poggio, T. Miller, E. K. (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235–5246. [PubMed] [Article] [PubMed]
Gilbert, S. J. Shallice, T. (2002). Task switching: A PDP model. Cognitive Psychology, 44, 297–337. [PubMed] [CrossRef] [PubMed]
Gruber, O. Karch, S. Schlueter, E. K. Falkai, P. Goschke, T. (2006). Neural mechanisms of advance preparation in task switching. Neuroimage, 31, 887–895. [PubMed] [CrossRef] [PubMed]
Hochstein, S. Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36, 791–804. [PubMed] [Article] [CrossRef] [PubMed]
Hung, C. P. Kreiman, G. Poggio, T. DiCarlo, J. J. (2005). Fast readout of object identity from macaque inferior temporal cortex. Science, 310, 863–866. [PubMed] [CrossRef] [PubMed]
Itti, L. Koch, C. (2001). Computational modelling of visual attention. Nature Reviews, Neuroscience, 2, 194–203. [PubMed] [CrossRef]
Jersild, A. T. (1927). Mental set and shift. Archives of Psychology, 89, 5–82.
Kahneman, D. Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman, R. Davies, & J. Beatty (Eds.), Varieties of attention (pp. 29–61). New York: Academic Press.
Kimberg, D. Y. Aguirre, G. K. D'Esposito, M. (2000). Modulation of task-related neural activity in task-switching: An fMRI study. Cognitive Brain Research, 10, 189–196. [PubMed] [CrossRef] [PubMed]
Kleinsorge, T. (2004). Hierarchical switching with two types of judgment and two stimulus dimensions. Experimental Psychology, 51, 145–149. [PubMed] [CrossRef] [PubMed]
Koch, I. (2005). Sequential task predictability in task switching. Psychonomic Bulletin & Review, 12, 107–112. [PubMed] [CrossRef] [PubMed]
Li, F. F. VanRullen, R. Koch, C. Perona, P. (2002). Rapid natural scene categorization in the near absence of attention. Proceedings of the National Academy of Sciences of the United States of America, 99, 9596–9601. [PubMed] [Article] [CrossRef] [PubMed]
Liu, T. Slotnick, S. D. Serences, J. T. Yantis, S. (2003). Cortical mechanisms of feature-based attentional control. Cerebral Cortex, 13, 1334–1343. [PubMed] [Article] [CrossRef] [PubMed]
Logan, G. D. Bundesen, C. (2003). Clever homunculus: Is there an endogenous act of control in the explicit task-cuing procedure? Journal of Experimental Psychology: Human Perception and Performance, 29, 575–599. [PubMed] [CrossRef] [PubMed]
Logan, G. D. Schneider, D. W. Bundesen, C. (2007). Still clever after all these years: Searching for the homunculus in explicitly cued task switching. Journal of Experimental Psychology: Human Perception and Performance, 33, 978–994. [PubMed] [CrossRef] [PubMed]
Meiran, N. (1996). Reconfiguration of processing mode prior to task performance. Journal of Experimental Psychology: Learning Memory and Cognition, 22, 1423–1442. [CrossRef]
Meiran, N. (2000). Modeling cognitive control in task-switching. Psychological Research, 63, 234–249. [PubMed] [CrossRef] [PubMed]
Monsell, S. (2003). Task switching. Trends in Cognitive Sciences, 7, 134–140. [PubMed] [CrossRef] [PubMed]
Monsell, S. Mizon, G. A. (2006). Can the task-cuing paradigm measure an endogenous task-set reconfiguration process? Journal of Experimental Psychology: Human Perception and Performance, 32, 493–516. [PubMed] [CrossRef] [PubMed]
Moore, C. M. Yantis, S. Vaughan, B. (1998). Object-based visual selection: Evidence from perceptual completion. Psychological Science, 9, 104–110. [CrossRef]
Nieuwenhuis, S. Monsell, S. (2002). Residual costs in task switching: Testing the failure-to-engage hypothesis. Psychonomic Bulletin & Review, 9, 86–92. [PubMed] [Article] [CrossRef] [PubMed]
O'Craven, K. M. Downing, P. E. Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587. [PubMed] [CrossRef] [PubMed]
Oliva, A. Schyns, P. G. (2000). Diagnostic colors mediate scene recognition. Cognitive Psychology, 41, 176–210. [PubMed] [CrossRef] [PubMed]
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442. [PubMed] [CrossRef] [PubMed]
Potter, M. C. Levy, E. I. (1969). Recognition memory for a rapid sequence of pictures. Journal of Experimental Psychology, 81, 10–15. [PubMed] [CrossRef] [PubMed]
Rensink, R. A. O'Regan, J. K. Clark, J. J. (1997). To see or not to see: The need for attention to perceive changes in scenes. Psychological Science, 8, 368–373. [CrossRef]
Reynolds, J. H. Alborzian, S. Stoner, G. R. (2003). Exogenously cued attention triggers competitive selection of surfaces. Vision Research, 43, 59–66. [PubMed] [CrossRef] [PubMed]
Riesenhuber, M. Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025. [PubMed] [Article] [CrossRef] [PubMed]
Roelfsema, P. R. Lamme, V. A. Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381. [PubMed] [CrossRef] [PubMed]
Rogers, R. D. Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231. [CrossRef]
Rushworth, M. F. Paus, T. Sipila, P. K. (2001). Attention systems and the organization of the human parietal cortex. Journal of Neuroscience, 21, 5262–5271. [PubMed] [Article] [PubMed]
Rushworth, M. F. Passingham, R. E. Nobre, A. C. (2005). Components of attentional set-switching. Experimental Psychology, 52, 83–98. [PubMed] [CrossRef] [PubMed]
Schneider, D. W. Logan, G. D. (2005). Modeling task switching without switching tasks: A short-term priming account of explicitly cued performance. Journal of Experimental Psychology: General, 134, 343–367. [PubMed] [CrossRef] [PubMed]
Serre, T. Oliva, A. Poggio, T. (2007). A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Sciences of the United States of America, 104, 6424–6429. [PubMed] [Article] [CrossRef]
Shulman, G. L. d'Avossa, G. Tansy, A. P. Corbetta, M. (2002). Two attentional processes in the parietal lobe. Cerebral Cortex, 12, 1124–1131. [PubMed] [Article] [CrossRef] [PubMed]
Simons, D. J. Levin, D. T. (1998). Failure to detect changes to people during a real-world interaction. Psychonomic Bulletin & Review, 5, 644–649. [CrossRef]
Simons, D. J. Rensink, R. A. (2005). Change blindness: Past, present, and future. Trends in Cognitive Sciences, 9, 16–20. [PubMed] [CrossRef] [PubMed]
Slagter, H. A. Weissman, D. H. Giesbrecht, B. Kenemans, J. L. Mangun, G. R. Kok, A. (2006). Brain regions activated by endogenous preparatory set shifting as revealed by fMRI. Cognitive, Affective & Behavioral Neuroscience, 6, 175–189. [PubMed] [CrossRef] [PubMed]
Smith, A. B. Taylor, E. Brammer, M. Rubia, K. (2004). Neural correlates of switching set as measured in fast, event-related functional magnetic resonance imaging. Human Brain Mapping, 21, 247–256. [PubMed] [CrossRef] [PubMed]
Sohn, M. H. Anderson, J. R. (2001). Task preparation and task repetition: Two-component model of task switching. Journal of Experimental Psychology: General, 130, 764–778. [PubMed] [CrossRef] [PubMed]
Sohn, M. H. Ursu, S. Anderson, J. R. Stenger, V. A. Carter, C. S. (2000). Inaugural article: The role of prefrontal cortex and posterior parietal cortex in task switching. Proceedings of the National Academy of Sciences of the United States of America, 97, 13448–13453. [PubMed] [Article] [CrossRef] [PubMed]
Spector, A. Biederman, I. (1976). Mental set and mental shift revisited. American Journal of Psychology, 89, 669–679. [CrossRef]
Stoet, G. Snyder, L. H. (2003). Executive control and task-switching in monkeys. Neuropsychologia, 41, 1357–1364. [PubMed] [CrossRef] [PubMed]
Sudevan, P. Taylor, D. A. (1987). The cuing and priming of cognitive operations. Journal of Experimental Psychology: Human Perception and Performance, 13, 89–103. [PubMed] [CrossRef] [PubMed]
Swainson, R. Cunnington, R. Jackson, G. M. Rorden, C. Peters, A. M. Morris, P. G. (2003). Cognitive control mechanisms revealed by ERP and fMRI: Evidence from repeated task-switching. Journal of Cognitive Neuroscience, 15, 785–799. [PubMed] [CrossRef] [PubMed]
Thorpe, S. Delorme, A. Van Rullen, R. (2001). Spike-based strategies for rapid processing. Neural Networks, 14, 715–725. [PubMed] [CrossRef] [PubMed]
Thorpe, S. Fize, D. Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522. [PubMed] [CrossRef] [PubMed]
Treisman, A. M. Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97–136. [PubMed] [CrossRef] [PubMed]
Wagner, G. Boynton, R. M. (1972). Comparison of four methods of heterochromatic photometry. Journal of the Optical Society of America, 62, 1508–1515. [PubMed] [CrossRef] [PubMed]
Wolfe, J. M. Horowitz, T. S. Kenner, N. Hyle, M. Vasan, N. (2004). How fast can you change your mind? The speed of top-down guidance in visual search. Vision Research, 44, 1411–1426. [PubMed] [CrossRef] [PubMed]
Wylie, G. Allport, A. (2000). Task switching and the measurement of “switch costs.” Psychological Research, 63, 212–233. [PubMed] [CrossRef] [PubMed]
Yeung, N. Nystrom, L. E. Aronson, J. A. Cohen, J. D. (2006). Between-task competition and cognitive control in task switching. Journal of Neuroscience, 26, 1429–1438. [PubMed] [Article] [CrossRef] [PubMed]
Figure 1
 
Example stimuli for means of transport (A), animals (B), and distracters (C) as well as example masks (D). Masks are created by superimposing a naturalistic texture on a mixture of white noise at different spatial frequencies (Li et al., 2002), surrounded by a frame with broken-up segments of orange, blue, and purple. In this figure, the thickness of the color frames is exaggerated threefold for illustration.
Figure 2
 
The four tasks used in our experiments, here represented by the words used to cue subjects. The animal and vehicle detection tasks form the IMG task group because both relate to the natural scene photograph. Relating to the color frame around the image, the orange and the blue detection tasks make up the COL task group. In mixed-task blocks, any two of these four tasks are interleaved. Thus, we distinguish within-group switches (e.g., from “animal” to “transport”) from between-group switches (e.g., from “blue” to “animal”). Comparing switch costs between these two kinds of switches enables us to determine the cost of shifting attention to a different stimulus attribute.
Figure 3
 
Time course of a trial, starting 1,300 ms before target onset with a blank gray screen. At 650 ± 25 ms before target onset, a white fixation dot (4.1′ × 4.1′) is presented at the center of the display. At a variable cue-target interval (CTI) before target onset, a word cue (0.5° high, between 1.1° and 2.5° wide) appears at the center of the screen for 17 ms (two frames), temporarily replacing the fixation dot for CTIs less than 650 ms. At 0 ms, the target stimulus, consisting of a gray-level photograph and a color frame around it, is presented at a random position on a circle around the fixation dot, such that the image is centered around 6.4° eccentricity. After a stimulus onset asynchrony (SOA) of 200–242 ms, the target stimulus is replaced with a perceptual mask. The mask is presented for 500 ms, followed by 1,000 ms of blank gray screen to allow the subjects to respond. In case of an error, acoustic feedback is given (pure tone at 800 Hz for 100 ms), followed by 100 ms of silence. After this, the next trial commences.
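The time course described in this caption can be written down as a list of events relative to target onset. The Python sketch below is a minimal illustration: the function name, the event labels, and the assumption that error feedback starts only after the 1,000-ms response window has ended are ours and are not specified in the caption. The jittered fixation onset and the variable SOA are passed in as parameters rather than drawn from a distribution.

```python
def trial_timeline(cti_ms, soa_ms, fixation_lead_ms=650, error=False):
    """Return (time in ms relative to target onset, event) pairs in temporal order."""
    events = [
        (-1300, "blank gray screen"),
        (-fixation_lead_ms, "fixation dot on"),      # 650 +/- 25 ms before target
        (-cti_ms, "word cue on"),
        (-cti_ms + 17, "word cue off"),              # cue lasts 17 ms (two frames)
        (0, "target on"),
        (soa_ms, "mask on"),                         # mask replaces the target
        (soa_ms + 500, "mask off, blank screen"),    # mask lasts 500 ms
        (soa_ms + 1500, "response window ends"),     # 1,000 ms of blank screen
    ]
    if error:
        # Assumption: feedback follows the response window.
        events.append((soa_ms + 1500, "error tone on (800 Hz)"))
        events.append((soa_ms + 1600, "error tone off, silence"))
        events.append((soa_ms + 1700, "next trial"))
    # sorted() is stable, so events with equal times keep their listed order.
    return sorted(events, key=lambda event: event[0])


if __name__ == "__main__":
    for time_ms, event in trial_timeline(cti_ms=50, soa_ms=200, error=True):
        print(f"{time_ms:6d} ms  {event}")
```

Sorting by time also covers the 800-ms CTI, for which the word cue precedes the fixation dot instead of temporarily replacing it.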
Figure 4
 
Reaction times (RTs) for single-task blocks, task-repeat trials, and task-switch trials in mixed-task blocks for n = 6 subjects. Error bars are SEM. Both mixing and switch costs are significant for a cue-target interval (CTI) of 50 ms, but not for CTIs of 200 and 800 ms. The drop of the single-task RT at 200 ms compared to 50 and 800 ms is not significant (p > 0.05, t test).
Figure 5
 
Switch cost in reaction time (RT) at a cue-target interval (CTI) of 50 ms is only significant for between-group switches (red) but not for within-group switches (blue). Error bars are SE as defined in Equation 2. For an illustration of between- and within-group switches, see Figure 2. Note that the switch cost of 16.1 ms in Figure 4 arises from pooling the data over the four switch conditions that are shown individually in this figure.
Table 1
 
Stimulus probabilities depending on task.
Task | Animal images | Transport images | Distracter images
Animal | 50% | 25% | 25%
Transport | 25% | 50% | 25%
Table 2
 
Three-way analysis of variance (ANOVA) for mixing cost in reaction time at cue-target interval (CTI) = 50 ms.
Source | df | Mean square | F | p
CTI | 2 | 5,616 | 18.55 | 0.0004 ***
Task group | 1 | 1,730 | 5.71 | 0.038 *
Subject identity | 5 | 2,407 | 7.95 | 0.003 **
Table 3
 
Reaction time mixing cost (in ms) by task group for the three values of the cue-target interval (CTI). Standard errors are given in brackets.
Task group | CTI 50 ms | CTI 200 ms | CTI 800 ms
IMG | 62.8 (11.5)*** | 29.7 (11.9)* | 9.7 (10.5)
COL | 39.1 (12.8)** | 13.1 (14.4) | 4.2 (12.9)
Overall | 49.0 (8.6)*** | 19.7 (9.2) | 3.6 (8.3)
Table 4
 
Four-way analysis of variance (ANOVA) for switch cost in reaction time.
Source | df | Mean square | F | p
CTI | 2 | 1,559 | 6.01 | 0.006 **
Task group | 1 | 140 | 0.54 | 0.47 ns
Subject identity | 5 | 509 | 1.96 | 0.11 ns
Switch condition | 1 | 4,390 | 16.93 | 0.0002 ***
Table 5
 
Switch cost (in ms) by switch condition for the three cue-target interval (CTI) values. Standard errors are given in brackets.
Switch condition | CTI 50 ms | CTI 200 ms | CTI 800 ms
IMG→IMG | 0.3 (11.5) | −0.5 (10.4) | 9.3 (11.4)
COL→COL | 7.7 (11.6) | −4.3 (13.5) | −12.8 (10.8)
COL→IMG | 23.5 (12.0)* | 11.2 (11.6) | 0.8 (10.6)
IMG→COL | 30.7 (11.5)** | 11.7 (12.1) | −1.1 (10.7)
Overall | 16.1 (6.9)* | 4.6 (7.0) | −1.2 (6.5)