Abstract
What is the reference frame for attentional tracking of multiple targets? The premotor theory of spatial attention predicts a retinotopic reference frame: if peripheral attention is akin to saccadic preparation, then it should be generated relative to fixation. Object-based attention theories, however, predict a more allocentric reference frame, because targets are perceptually grouped into a non-rigid virtual object defined by their arrangement in the display. We tested these theories by discretely shifting the entire visual display during tracking, thereby changing the retinal coordinates of the targets while keeping their relative arrangement constant. Participants were asked to keep track of a subset of 2, 3 or 4 disks among 8 disks moving randomly inside a display box (10×10 degrees of visual angle, dva) on a computer monitor (24×32 dva). At 1-second intervals during the 9-second tracking task, all the disks disappeared for 0.5 seconds. When the disks disappeared, the box either immediately shifted to one of eight positions on the monitor or remained in the same location. The disks then reappeared at the positions, relative to the display box, from which they had disappeared. Average tracking performance was worse when the tracking display shifted (62% correct) than when the disks merely blanked without shifting (86% correct; F(2,143)=35, p < .001). Shifting the display in a predictable way (clockwise around the monitor), in order to encourage predictable eye movements, improved performance (from 62% to 68% correct; F(1,65)=4, p < .05). Encouraging participants to group targets into a non-rigid virtual object, both with explicit instruction and with a canonical arrangement of the targets at the start of the trial (as in Yantis, 1992, Cog. Psy. 24:295–340), had no effect on performance.
These experiments demonstrate that attentional tracking is impaired by discrete display translations, suggesting that a retinotopic reference frame may be used by the mechanism that tracks target motion.
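The logic of the display-shift manipulation can be illustrated with a minimal sketch. It is not the experiment code. The disk positions, box origins, and fixation point are hypothetical values chosen for illustration, and gaze is assumed to stay at a fixed point on the monitor during the shift. Under those assumptions, translating the box changes every disk's coordinates relative to fixation while leaving its coordinates relative to the box unchanged.

```python
import math

BOX_SIZE = 10.0  # side of the display box in dva (from the abstract)

def relative_to(origin, points):
    """Express screen points relative to an origin (fixation or box corner)."""
    ox, oy = origin
    return [(x - ox, y - oy) for x, y in points]

def to_screen(box_origin, rel_positions):
    """Place box-relative disk positions on the monitor."""
    bx, by = box_origin
    return [(bx + x, by + y) for x, y in rel_positions]

def close(a, b, tol=1e-9):
    """True if two lists of points match within a small tolerance."""
    return all(
        math.isclose(p[0], q[0], abs_tol=tol) and math.isclose(p[1], q[1], abs_tol=tol)
        for p, q in zip(a, b)
    )

# Hypothetical box-relative positions of the 8 disks at blanking (dva).
disks = [(1.0, 2.0), (8.5, 3.0), (4.0, 9.0), (6.5, 6.5),
         (2.5, 7.0), (9.0, 8.0), (5.0, 1.5), (3.0, 4.5)]

old_origin = (11.0, 7.0)   # hypothetical: box centred on a 32x24 dva monitor
new_origin = (22.0, 14.0)  # hypothetical shifted location, still on the monitor
fixation = (16.0, 12.0)    # assumption: gaze stays at the monitor centre

before = to_screen(old_origin, disks)
after = to_screen(new_origin, disks)

# Box-relative (allocentric) coordinates survive the shift ...
allocentric_same = close(relative_to(old_origin, before),
                         relative_to(new_origin, after))
# ... but fixation-relative (retinotopic) coordinates do not.
retinotopic_same = close(relative_to(fixation, before),
                         relative_to(fixation, after))

print("allocentric arrangement preserved:", allocentric_same)
print("retinotopic positions preserved:", retinotopic_same)
```

If gaze instead moved with the box, the fixation-relative coordinates would also be preserved in this sketch, which is one way to read the benefit of predictable shifts reported above.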