We have presented a neurodynamic model of saccade target selection, VWM, and their bidirectional interactions. In recent empirical work, features maintained in VWM have been shown to modulate the initial visual salience of perceptual stimuli, biasing attention and gaze toward memory-matching objects (e.g., Hollingworth et al., 2013b). In addition, we have demonstrated that the selection of an object as the saccade target biases color representations in VWM toward the color value of the saccade target (Experiment 2 of the present study). Taken together, the results show that VWM cannot be adequately described in terms of passive and static representations; rather, VWM is active and continuously coupled to the attentional processes that control saccade target selection. Our model reflects this view by structuring VWM and saccade target selection as a dynamical system comprising active and continuously interacting representations. In particular, bidirectional interactions are realized by coupling the VWM and saccade target selection systems to a shared, low-level sensory field, implementing feature-based and spatial attention effects on sensory processing. The quantitative fits of behavioral data demonstrate that the interactions between spatial and surface feature representations can indeed account for the observed dynamic and metric changes in saccade behavior. Although such model fits can never prove the validity of a model, they show that the conceptual explanation is viable and consistent and does not contain hidden assumptions or undetected conflicts.
The neurodynamic model presented here is related to several previous lines of modeling work. It brings together separate DNF accounts of saccade planning and of VWM for spatial locations or surface features. In addition, it shares important features with previous work on the interaction of feature and spatial attention that has been used to explain the mechanisms underlying visual search (Hamker, 2003, 2005a, 2005b). In the next sections, we first discuss the relationship between the present model and earlier models of saccade planning and VWM. We then discuss the relationship with models of visual search.
DNF models of saccade behavior describe the neural process underlying saccade target selection and saccade initiation as the formation of an activation peak in a field defined over retinal space (Kopecz & Schöner, 1995; Marino et al., 2012; Trappenberg et al., 2001; Wilimzig et al., 2006). This mechanism has been used to explain saccade latency effects in the gap-step-overlap paradigm (Kopecz & Schöner, 1995) and latency effects caused by the presence of distractors (Trappenberg et al., 2001). Averaging saccades in the presence of a closely spaced target and distractor have been explained through the merging of activation peaks (Trappenberg et al., 2001; Wilimzig et al., 2006), a mechanism used in the present study to reproduce the behavioral data for the near-distractor paradigm. The DNF approach also provides a straightforward way to integrate task-related (top-down) and stimulus-driven (bottom-up) information in saccade planning by providing separate inputs with different characteristics to the same field (Kopecz & Schöner, 1995; Trappenberg et al., 2001).
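As a concrete illustration of this peak-formation and peak-merging mechanism, the sketch below simulates a one-dimensional Amari-type field receiving inputs from a closely spaced target and distractor. The parameters, input strengths, and sigmoid output function are illustrative choices rather than the fitted values of the present model; with these settings, the two inputs merge into a single activation peak near their midpoint, as in an averaging saccade.

```python
import numpy as np

def gauss(d, sigma):
    """Gaussian profile as a function of metric distance d."""
    return np.exp(-0.5 * (d / sigma) ** 2)

# field over horizontal retinal position (deg); all parameters are illustrative
x = np.linspace(-40.0, 40.0, 81)
kernel = 1.0 * gauss(x[:, None] - x[None, :], 4.0) - 0.15  # local excitation,
                                                           # global inhibition

def simulate(field_input, tau=10.0, h=-3.0, beta=2.0, dt=1.0, steps=300):
    """Euler integration of a 1-D Amari field:
    tau * du/dt = -u + h + input + sum_j w(x_i - x_j) * f(u_j)."""
    u = np.full_like(x, h)
    for _ in range(steps):
        f_u = 1.0 / (1.0 + np.exp(-beta * u))   # sigmoidal output nonlinearity
        u = u + dt / tau * (-u + h + field_input + kernel @ f_u)
    return u

# a target and a closely spaced distractor provide overlapping inputs
stim = 4.0 * gauss(x - 4.0, 3.0) + 4.0 * gauss(x + 4.0, 3.0)
u_final = simulate(stim)
print("selected saccade position:", x[np.argmax(u_final)])  # near 0 deg (averaging)
```

With stronger inhibitory interactions or more widely separated inputs, the same class of field equation instead produces competition and selection of a single input, which is how DNF models capture both averaging and selective saccades.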
In the details of the saccade mechanism, the spatial pathway in the present model is closest to the implementation of Trappenberg et al. (2001). Both models use separate layers for saccade preparation and initiation (here, the spatial attention field and the saccade motor field), with fixation activity present in the preparatory layer. This earlier model was presented explicitly as a model of saccade-related activity in the superior colliculus, whereas in our view the system also integrates functionally analogous aspects of parietal cortex and frontal eye field activity. A significant innovation in the current work is the implementation of a space-code to rate-code transformation that generates a dynamically changing motor signal from the activation distribution in the DNF. This allows us to model the actual saccade execution (including the resulting shift in the visual image) and saccade termination as a result of neural dynamics, whereas previous models described only the processes leading up to the initiation of the saccade. The extended mechanism provides detailed saccade metrics and is particularly critical for capturing saccade amplitude effects in the target-only trials, which cannot be reproduced by simply reading out the position of the saccade motor peak.
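The space-code to rate-code transformation can be illustrated with a deliberately reduced sketch. Here the saccade motor field is replaced by a stand-in activation peak at the target's current retinal position, the readout collapses the suprathreshold output weighted by retinal position into a single signed velocity command, and termination is reduced to a simple velocity threshold; in the actual model, by contrast, saccade termination emerges from the field dynamics themselves. All functions and parameters below are hypothetical.

```python
import numpy as np

x = np.linspace(-40.0, 40.0, 81)                  # retinal position (deg)

def motor_field(eye_pos, target=10.0, sigma=3.0, amp=5.0, h=-2.0):
    """Stand-in for the saccade motor field: an activation peak at the
    target's current retinal position, which shifts as the eye moves."""
    return amp * np.exp(-0.5 * ((x - (target - eye_pos)) / sigma) ** 2) + h

def rate_readout(u, beta=2.0, gain=0.01):
    """Space-code to rate-code transformation: suprathreshold field output,
    weighted by retinal position, is collapsed into one signed velocity."""
    f_u = 1.0 / (1.0 + np.exp(-beta * u))
    return gain * np.sum(f_u * x)                 # eye velocity (deg per step)

eye_pos, dt = 0.0, 1.0
for _ in range(300):                              # integrate eye position
    v = rate_readout(motor_field(eye_pos))
    if abs(v) < 0.02:                             # peak near fovea: saccade ends
        break                                     # (simplified termination rule)
    eye_pos += v * dt
print("saccade landing position:", round(eye_pos, 1))  # close to the 10 deg target
```

The essential point is the closed loop: the rate-coded command moves the eye, the eye movement shifts the retinal input, and the shrinking motor signal eventually ends the saccade, which is what allows the mechanism to produce detailed amplitude metrics rather than only an initiation signal.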
DNF models of VWM have been used to explain various psychophysical results concerning VWM capacity and change detection performance (Johnson et al., 2009a, 2009b). These models use self-sustained activation peaks as the memory substrate and assume a continuous coupling of WM and perception, which was also implemented in the current model. The coupling of WM to perceptual input makes the memory representations susceptible to change by subsequent perceptual states. For example, in the domain of spatial WM, Schutte and Spencer (2009) used a multilayered DNF model to explain delay-dependent drift in spatial recall estimates relative to a perceived frame of reference. Notably, young children show biases toward a perceived midline axis. This bias is modulated by the distance between the remembered location and the axis, and bias magnitude increases systematically with increasing memory delay (see also Schutte & Spencer, 2010). These modulations of spatial memory by perceived environmental structure are mechanistically similar to the metric drift of color memory in Experiment 2, when the saccade target was relatively close in color space to the memory color, suggesting that perceptual coupling and perceptually induced drift may be fundamental properties of visual and spatial WM systems.
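This perceptually induced drift can be sketched with the same Amari-type field dynamics as above, now defined over a color dimension: a self-sustained memory peak is first established, and a weaker input at a metrically close value (standing in for the processed saccade-target color) then pulls the peak toward it. The parameters are again purely illustrative, and the size of the resulting shift depends on the metric separation and coupling strengths rather than reproducing the fitted model.

```python
import numpy as np

def gauss(d, sigma):
    return np.exp(-0.5 * (d / sigma) ** 2)

# field over a color dimension (hue difference, arbitrary units); illustrative
x = np.linspace(-40.0, 40.0, 81)
kernel = 1.0 * gauss(x[:, None] - x[None, :], 4.0) - 0.15  # excitation/inhibition
u = np.full_like(x, -3.0)                                  # resting level h = -3

def step(u, field_input, tau=10.0, h=-3.0, beta=2.0, dt=1.0):
    """One Euler step of the Amari field dynamics."""
    f_u = 1.0 / (1.0 + np.exp(-beta * u))
    return u + dt / tau * (-u + h + field_input + kernel @ f_u)

def run(u, field_input, steps):
    for _ in range(steps):
        u = step(u, field_input)
    return u

def remembered_value(u, beta=2.0):
    """Centroid of the suprathreshold output as the recalled color value."""
    f_u = 1.0 / (1.0 + np.exp(-beta * u))
    return np.sum(x * f_u) / np.sum(f_u)

memory_color = 5.0 * gauss(x - 0.0, 3.0)   # color to be remembered (value 0)
target_color = 1.5 * gauss(x - 8.0, 3.0)   # metrically close saccade-target color

u = run(u, memory_color, 100)              # encoding
u = run(u, 0.0, 100)                       # delay: memory peak self-sustains
print("recalled value before target:", round(remembered_value(u), 1))
u = run(u, target_color, 150)              # target color briefly processed
u = run(u, 0.0, 100)                       # delay again
print("recalled value after target: ", round(remembered_value(u), 1))  # drifts toward +8
```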
One significant difference between these previous DNF models of VWM and the present feature pathway is the presence of a contrast layer in place of the attention field (Johnson et al., 2009a, 2009b). This contrast layer receives excitatory sensory input and drives activation in the WM layer but is inhibited by feedback from the WM peaks. It thereby yields an active signal when a difference is detected between memorized and current visual input. The feature attention field in the present architecture, by contrast, is coupled in a purely excitatory fashion to the WM field. We consider these different connection patterns to reflect different functional aspects of visual processing, one enabling parallel change detection and the other responsible for the selection of individual items for focused processing. A recent scene representation model demonstrates that these two functions can be effectively integrated within a single DNF architecture (Schneegans, Spencer, & Schöner, in press); however, future work will be needed to probe whether the integrated architecture effectively captures both the data from change detection studies and the memory-based attentional biases examined here.
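The essential difference between the two connection patterns can be summarized in a minimal sketch with hypothetical Gaussian activation profiles and weights: subtracting WM feedback from the sensory input yields a mismatch (change) signal, whereas adding the two yields a memory-biased attention signal.

```python
import numpy as np

def gauss(d, sigma):
    return np.exp(-0.5 * (d / sigma) ** 2)

hue = np.linspace(0.0, 180.0, 181)
sensory = gauss(hue - 60.0, 10.0)        # current visual input (hue 60)
wm_peak = gauss(hue - 100.0, 10.0)       # self-sustained WM peak (hue 100)

# contrast layer (Johnson et al., 2009a, 2009b): excitatory sensory input,
# inhibitory WM feedback -> signal only where input and memory mismatch
contrast_drive = sensory - 1.2 * wm_peak

# feature attention field (present architecture): purely excitatory coupling
# to both sensory input and WM -> memory content biases feature selection
attention_drive = sensory + 0.8 * wm_peak

print("contrast drive at hues 60 / 100:", round(contrast_drive[60], 2),
      round(contrast_drive[100], 2))   # mismatch signaled, memorized hue suppressed
print("attention drive at hues 60 / 100:", round(attention_drive[60], 2),
      round(attention_drive[100], 2))  # both current and memorized hue receive drive
```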
The biasing effect of WM content is realized in the current model through the continuous bidirectional coupling between feature WM and feature attention. Recent experimental evidence indicates that there may be two forms of VWM: an active state that interacts with perceptual processing and an accessory or passive state that does not (Hollingworth & Hwang, 2013; Houtkamp & Roelfsema, 2006; Olivers et al., 2011). In Hollingworth and Hwang (2013), participants memorized two colors, one of which was subsequently cued as likely to be tested. In an intervening visual search task, only the cued color captured attention, as indicated by an increase in reaction time when that color was present as a distractor in the search display. Additionally, performance in the memory test was higher when the cued color was tested. We believe that such findings do not contradict the idea of continuous coupling between WM and attention. In fact, we propose that the combination of a multi-item WM representation with a selective (single-item) representation for feature attention, as implemented in the present model, provides a potential mechanism for the different memory states. Critically, the accessory state is observed in experiments only when another WM item is in the active state. We propose that the active state is characterized by recruitment of the feature attention representation, such that mutually supportive regions of activation persist in both representations. The competitive interactions in the feature attention representation then suppress the deployment of attention to other features and thereby prevent the other (passive) WM items from interacting with perceptual processing. The recruitment of feature attention would also stabilize the active WM item against random drift and decay of activation and thereby account for the increased memory performance for this item. It remains to be tested whether this mechanism can indeed explain the different memory states, and further adjustments of the feature pathway and its parameters in the present model may be needed to account for the behavioral data.
The combination of spatial and feature pathways with a shared low-level visual representation implements an architecture similar to several models of visual attention, in particular the neurodynamic models of Hamker (2003, 2005a, 2005b, 2006; also Fix et al., 2011). These models described dynamic interactions between spatial and feature representations to account for electrophysiological data on the time course of feature attention effects (Chelazzi, Duncan, Miller, & Desimone, 1998; Chelazzi, Miller, Duncan, & Desimone, 2001) and have shown how target features in VWM can guide spatial attention to produce visual search behavior. For these visual search tasks, the models can select object locations in a spatial representation through covert attention or as targets for a saccadic eye movement. Although these approaches employ population-code representations for space and surface features that enable them to capture metric effects in a fashion analogous to the present architecture, they have not previously been used to investigate metric effects of feature WM on individual saccade amplitudes (though saccade latencies for different types of visual search tasks have been modeled in Hamker, 2005a), nor have any of these approaches addressed effects of perceptual processing and attentional selection on VWM representations. Nevertheless, we believe that the neural mechanisms underlying visual search are likely to be the same as those that produce the metric effects in saccade behavior and memory representations in the present study, as reflected by the analogous mechanisms in the computational models.
One important conceptual aspect that we share with these previous approaches is how we conceive of the nature of visual attention. The deployment of attention is described as a continuous process that emerges from the interactions of different spatial and surface feature representations (as earlier proposed by Deco & Lee, 2002, 2004). The specific interaction patterns promote the selection of a single location and its associated surface features, which are represented more strongly at the expense of other locations and features. There is not, however, a strict requirement that attention be localized (it can be distributed at least transiently; compare Zirnsak, Beuth, & Hamker, 2011), and there is no discrete moment in time at which an attentional selection takes place. This contrasts with many other models of visual search, in which attentional selection occurs as a discrete, winner-takes-all operation in some form of spatial priority map (Navalpakkam & Itti, 2005; Wolfe, 1994). The resulting lack of distinct attentive and preattentive phases of processing in the present work is consistent with the empirical finding that effects of VWM occur even for simple saccades with latencies in the range of 80 to 130 ms (see, in particular, Hollingworth et al., 2013a). Such results suggest that the interactions between features held in VWM and the processing of visual objects take place at an early, sensory stage, influencing the first sweep of sensory processing following stimulus appearance. The resulting view implemented in the computational model is that the visual salience of a particular object is a joint property of the object's physical attributes, feature biases (e.g., the match between those attributes and VWM content), and spatial biases (e.g., partial knowledge of the target location in the present task).
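This joint-salience view can be stated compactly: the effective salience of a location is the sum of a bottom-up contribution and of feature-based and spatial biases expressed over the same spatial dimension. The sketch below, with hypothetical objects and bias strengths, shows how a memory-matching object can dominate a physically identical competitor.

```python
import numpy as np

def gauss(d, sigma):
    return np.exp(-0.5 * (d / sigma) ** 2)

x = np.linspace(-40.0, 40.0, 81)                   # horizontal position (deg)

# two objects with identical physical contrast, at -12 and +12 deg
bottom_up = 4.0 * gauss(x + 12.0, 3.0) + 4.0 * gauss(x - 12.0, 3.0)

# the object at +12 matches the color held in VWM: feature-based gain boost
feature_bias = 1.5 * gauss(x - 12.0, 3.0)

# partial knowledge of the likely target side: broad spatial bias to the right
spatial_bias = 1.0 * gauss(x - 10.0, 15.0)

salience = bottom_up + feature_bias + spatial_bias
print("most salient location:", x[np.argmax(salience)])   # +12: memory match wins
```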
Compared with extant models of visual search, the computational model presented here has several limitations. Some aspects of the implementation were intentionally simplified to reflect our focus on the details of low-level saccade planning and execution. In particular, we describe only a single spatial dimension, because only horizontal saccade metrics were considered a relevant behavioral measure in the experiment. Moreover, the model describes interactions for only one surface feature dimension (color). In visual search tasks, the combination of different surface features is critical for capturing the results of feature conjunction tasks and plays a central role in explaining the difference between serial and parallel search (Hamker, 2005a; Wolfe, 1994). However, a recent extension of the model presented here does introduce additional surface feature dimensions to address multifeature change detection tasks, and it describes neural mechanisms for the parallel detection of simple feature changes and the sequential detection of feature conjunction changes (Schneegans et al., in press). A further simplification in the present model (shared with the visual search models described here) is that it does not capture any increase in the complexity of visual features along the surface feature pathway. This simplification is again driven by the focus on simple color effects in the experiment, although it ignores the generation of color representations from simple color-opponency pairs at the earliest levels of perceptual processing.
Compared with earlier approaches, the present work expands a dynamic explanation of visual attention from qualitative effects observed in visual search to metric effects in simple saccade planning and WM performance. The psychophysical experiments presented here and in previous related work (Hollingworth et al., 2013a, 2013b) provide a new method for investigating and quantifying the interactions between spatial and surface feature representations, and the results provide important constraints for modeling visual attention. The computational model demonstrates how VWM for surface features influences even the metric details of saccade planning and execution and how, conversely, the detailed content of VWM is affected by perceptual processing and attentional selection of visual objects.