Humans use specific spatial reference frames to retain remembered visual information for a goal-directed action (
Crawford, Henriques, & Medendorp, 2011;
Soechting & Flanders, 1992). The visual system is thought to utilize two types of spatial reference frames: observer-centered, egocentric reference frames versus world-fixed, allocentric reference frames, often anchored relative to reliable landmarks (
Byrne, Cappadocia, & Crawford, 2010;
Howard & Templeton, 1966;
Vogeley & Fink, 2003). In ordinary circumstances, the brain integrates information from these two reference frames in a Bayesian manner (
Byrne & Crawford, 2010;
Fiehler, Wolf, Klinghammer, & Blohm, 2014). However, certain circumstances may require one to ignore surrounding landmarks (e.g., when they are unstable or irrelevant to the task) or to rely strongly on them (e.g., when a workspace is fixed to a moving base). Various experiments have tapped into these mechanisms by instructing participants to use one reference frame over the other (
Byrne & Crawford, 2010;
Lemay, Bertram, & Stelmach, 2004). This raises the question of which instruction leads to better performance.
Several studies have explored the implicit influence of visual landmarks on goal-directed actions such as saccades, reaches, or pointing toward a seen or remembered target. Previous findings have shown that reach targets can be remembered reasonably well in the absence of visual landmarks (
Lemay & Stelmach, 2005;
McIntyre, Stratta, & Lacquaniti, 1997;
Vindras & Viviani, 1998) with certain stereotypical errors such as gaze-centered overshoots (
Bock, 1986;
Henriques, Klier, Smith, Lowy, & Crawford, 1998). However, the addition of a visual landmark can influence reaching, reducing both constant and variable errors (
Byrne et al., 2010;
Krigolson & Heath, 2004;
Lemay et al., 2004;
Redon & Hay, 2005). The stabilizing influence of a landmark was particularly prominent in a task that involved remapping the reach target to the opposite visual hemifield, where one would expect egocentric signals to be less stable (
Byrne et al., 2010). Conversely, landmarks had less stabilizing influence on behavior when they were shifted and rotated relative to the reach goal (
Thaler & Todd, 2009). Finally, the addition of visual landmarks can negate the accumulation of reach errors after prolonged memory delays in the dark (
Chen, Byrne, & Crawford, 2011).
Normally, egocentric and allocentric cues agree, but they can also conflict, either in tasks that introduce egocentric noise or when the visual environment is unstable (
Byrne & Crawford, 2010;
Byrne et al., 2010;
Chen et al., 2011). The latter situation has been simulated experimentally in cue-conflict tasks where the landmark is surreptitiously shifted relative to egocentric coordinates during a memory delay (
Byrne & Crawford, 2010). In this situation, ego/allocentric cues appear to be optimally integrated, based on their relative reliability (
Byrne & Crawford, 2010). Usually, more weight is placed on egocentric coordinates, such that the movement shifts by approximately one third of the landmark shift (
Byrne & Crawford, 2010;
Fiehler et al., 2014;
Li, Sajad et al., 2017). However, the specific weighting depends on task details. For example,
Byrne and Crawford (2010) found that participants relied more on a landmark when it was perceived to be stable or when gaze position was less stable. Further, in simulated naturalistic settings, landmarks had more influence when they were task relevant, when more than one landmark was shifted in the same direction, and when the landmark was closer to the target (
Fiehler et al., 2014). Thus, in the absence of explicit instructions, the visual system uses implicit algorithms to determine how to weight ego/allocentric cues. Recent physiological studies suggest that this implicit integration may occur in frontal cortex (
Bharmauria et al., 2020;
Bharmauria, Sajad, Yan, Wang, & Crawford, 2021).
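The reliability-weighted (maximum-likelihood) integration described above can be illustrated with a minimal sketch; the function name and the variance values below are hypothetical, chosen only so that the integrated estimate shifts by roughly one third of the landmark shift, as reported empirically:

```python
def integrate_cues(x_ego, x_allo, var_ego, var_allo):
    """Inverse-variance (reliability-weighted) combination of an egocentric
    and an allocentric estimate of target position (e.g., in degrees)."""
    w_allo = (1.0 / var_allo) / (1.0 / var_ego + 1.0 / var_allo)
    return w_allo * x_allo + (1.0 - w_allo) * x_ego

# Target remembered at 0 deg in egocentric coordinates; a 3 deg landmark
# shift makes the allocentric estimate read 3 deg. If the allocentric cue
# is half as reliable (twice the variance), the integrated reach shifts by
# one third of the landmark shift (approximately 1 deg here).
print(integrate_cues(x_ego=0.0, x_allo=3.0, var_ego=1.0, var_allo=2.0))
```

Under this rule, making the landmark appear more stable (lowering its effective variance) increases its weight, consistent with the task-dependent weighting described above.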
Alternatively, people can be instructed to ignore or to rely entirely on a visual landmark, as in the common driving instruction, “Ignore the first stop sign and turn right at the second.” Likewise, experimental participants can be instructed either to ignore a landmark or to reach to a fixed location relative to it. Such instructions have been used in neuropsychological and neuroimaging experiments suggesting involvement of the ventral visual stream in allocentric representations and of the dorsal stream in egocentric transformations (
Chen et al., 2014;
Chen, Monaco, & Crawford, 2018;
Goodale, Westwood, & Milner, 2004;
Schenk, 2006).
Thus, the influence of a landmark on goal-directed action can be determined both by bottom–up factors (e.g., priors, cue reliability) and by explicit top–down task instructions (e.g., prioritize egocentric vs. allocentric cues). However, the influence of the instruction itself on performance (accuracy, precision, and reaction time) is less clear, particularly when egocentric and allocentric cues conflict. For example, in their control tasks,
Byrne and Crawford (2010) found no overall difference in performance when participants were explicitly instructed to use egocentric or allocentric cues, but there was no cue conflict in these tasks, and the visual stimuli were not held constant. Instructions have a well-documented impact on perception, as seen in dichotic listening tasks, where they influence encoding and recall, often overriding other stimuli except highly salient ones such as the participant's own name (
Moray, 1959). Conversely, some perceptual illusions, such as the Pong effect, are resistant to instruction, as shown by
Laitin and Witt (2020), where guiding instructions did not reduce susceptibility to the illusion. Based on this, it is reasonable to expect that ego/allocentric instruction might also affect visually guided action, beyond the simple task switching intended by the experimental design.
Here, we tested the influence of instruction (to use or not use a landmark) in a cue-conflict, memory-guided reach task where the visual stimuli were equal and balanced across tasks. By forcing participants to either ignore the landmark or use it for the task, we could isolate and directly compare how top–down reliance on allocentric versus egocentric cues affected reach behavior. It has been suggested that landmark-centered encoding is a less noisy process and more stable over a time delay (
Byrne et al., 2010;
Chen et al., 2011). Thus, we predicted that instruction to attend to and use the landmark for spatial coding would increase weighting on the more stable code and therefore improve performance, especially when the task complexity increases egocentric noise (
Byrne & Crawford, 2010;
Chen et al., 2011). To simulate the latter case, we included an “anti-reach” condition where participants had to reach toward the mirror-opposite position relative to the visual fixation point (
Cappadocia, Monaco, Chen, Blohm, & Crawford, 2017;
Everling & Munoz, 2000;
Gail & Andersen, 2006). We found that (a) reaching was less variable and more accurate in the allocentric instruction tasks than egocentric instruction tasks (especially in the right visual field), and (b) the beneficial effect of allocentric encoding was more pronounced when participants were required to respond in the visual field opposite the stimulus encoding.