Research Article  |   March 2009
Viewing task influences eye movement control during active scene perception
Journal of Vision March 2009, Vol.9, 6. doi:https://doi.org/10.1167/9.3.6
      Monica S. Castelhano, Michael L. Mack, John M. Henderson; Viewing task influences eye movement control during active scene perception. Journal of Vision 2009;9(3):6. https://doi.org/10.1167/9.3.6.

Abstract

Expanding on the seminal work of G. Buswell (1935) and I. A. Yarbus (1967), we investigated how task instruction influences specific parameters of eye movement control. In the present study, 20 participants viewed color photographs of natural scenes under two instruction sets: visual search and memorization. Results showed that task influenced a number of eye movement measures including the number of fixations and gaze duration on specific objects. Additional analyses revealed that the areas fixated were qualitatively different between the two tasks. However, other measures such as average saccade amplitude and individual fixation durations remained constant across the viewing of the scene and across tasks. The present study demonstrates that viewing task biases the selection of scene regions and aggregate measures of fixation time on those regions but does not influence other measures, such as the duration of individual fixations.

Introduction
Human vision is an active, dynamic process in which the viewer seeks out specific visual input as needed to support ongoing cognitive and behavioral activity (Findlay & Gilchrist, 2001; Henderson, 2003, 2007). A critical aspect of active vision is directing the eyes to task-relevant stimuli in the environment. Current theoretical treatments of human gaze control in active vision focus on two sources of input to the control system: stimulus-based sources and cognitive sources. Cognitive control, though highlighted in two classic eye movement studies (Buswell, 1935; Yarbus, 1967) and a dominant theoretical driving force in experimental psychological studies in the 1960s and 1970s (e.g., Antes, 1974; Friedman, 1979; Loftus & Mackworth, 1978; Mackworth & Morandi, 1967), has taken a less central role in recent theorizing about the control of eye movements in scene perception (Itti & Koch, 2001; Koch & Ullman, 1985; Parkhurst, Law, & Niebur, 2002; Parkhurst & Niebur, 2003). In the present study, we were concerned with a particular aspect of cognitive control: the influence of viewing task on the control of eye movements during scene viewing. 
Buswell (1935) and Yarbus (1967) both examined the influence of viewing task on eye movements during complex picture viewing. These studies began with the observation that eye fixations are not randomly distributed in a scene but instead tend to cluster on some regions at the expense of others. For example, in Chapter 6, Buswell (1935) asked a group of participants to first look at a picture of a tower under free viewing instructions and then look to find a person in one of the windows in the tower. He noted that the fixation distribution changed dramatically (as would be expected): participants fixated for longer and more often when examining the tower with the search instructions than with the free viewing instructions. He also asked them to look once at a scene with no particular instructions and then a second time, after reading a description of the picture. He found that the number of fixations made after the description was read increased substantially (from 61 to 108 fixations). He surmised that these changes in fixation number across both experiments were due to the instructions (or descriptions) arousing the interest of participants to certain parts of the picture. He referred to these effects as changing the “mental set” of participants as they viewed the pictures. 
In a classic comparison that has been widely described, Yarbus (1967) asked a participant to look at the painting, “The Unexpected Visitor” by I.E. Repin under seven different viewing instructions. Yarbus (1967, p. 174) showed that eye movement patterns differed dramatically with task instructions. For example, fixations were more likely to land on the faces of the people in the image when the instructions were to estimate the age of the people in the painting rather than estimate the material circumstances of the family. 
Both the Buswell and Yarbus studies strongly suggest that task affects eye movement behavior during scene viewing. However, in both of these studies, viewing task was not manipulated in a controlled manner. For example, in both studies, task instruction was confounded with viewing order. In addition, the reported data were sparse and qualitative: both Buswell and Yarbus simply presented the scan patterns generated for each task and both described the eye movement pattern according to which parts of the picture seemed to have a greater cluster of fixations. Buswell reported the number of fixations overall, but these were confounded with the fact that participants were allowed to view the picture for as long as they pleased, with no recording of those times. Furthermore, although it is clear from these studies that cognitive processes influence viewing patterns, it is unclear which specific eye movement parameters are affected by task. More fine-grained measures of eye movement behavior were not presented. We are aware of no published study that has attempted to directly examine eye movements as a function of viewing task in a fully controlled experimental design. 
The second goal of the present study was to examine the nature of fixation duration control during scene viewing. In comparison to the reading literature, few research studies have investigated the control of fixation durations during scene perception (Castelhano & Rayner, 2008; Henderson, 2003; Henderson & Hollingworth, 1998; Rayner, 1998). The reason for this is quite simple: across most studies, the task in reading is well understood to be to comprehend the text. Therefore, models and theories of eye movements of reading have been focused on underlying cognitive processes and have been able to account for such measures as fixation placement and fixation duration with great success (Engbert, Longtin, & Kliegl, 2002; Engbert, Nuthmann, Richter, & Kliegl, 2005; Kliegl & Engbert, 2003; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Pollatsek, & Rayner, 2006; Reichle, Rayner, & Pollatsek, 2003). In contrast, with scene perception the task is often varied and sometimes vague (e.g., free viewing instructions or viewing for a preference rating). As a result, it is more difficult to pinpoint the underlying processes affecting fixation durations (but see Henderson & Pierce, 2008; Henderson & Smith, in press; van Diepen & d'Ydewalle, 2003). 
A number of studies on scene perception and eye movement control have been concerned with the nature of image properties that are related to fixation location (Henderson, Brockmole, Castelhano, & Mack, 2007; Krieger, Rentschler, Hauske, Schill, & Zetzsche, 2000; Mannan, Ruddock, & Wooding, 1996, 1997; Parkhurst et al., 2002; Parkhurst & Niebur, 2003; Reinagel & Zador, 1999; Tatler, Baddeley, & Gilchrist, 2005), and computational models that have reflected this focus on the placement of fixations (Itti & Koch, 2001; Torralba, Oliva, Castelhano, & Henderson, 2006). Though fixation position is an important component of eye movement behavior, an equally important component is the amount of time the eyes remain fixated in a particular scene region. Models of eye movement control that focus on location but ignore duration lead to an incomplete account and can even lead to misleading conclusions about the nature of eye movement control (Henderson, 2003). The average fixation durations across the entire scene as well as the summed durations of consecutive fixations on a specific region of interest are related to the ongoing perceptual and cognitive activity (Castelhano & Henderson, 2008b; Henderson & Ferreira, 2004; Henderson, Weeks, & Hollingworth, 1999; Rayner, 1998). 
Intimately related to control of fixation placement and duration is the question of cognitive control over saccade amplitudes across the entire scene. Reviews of natural scene studies report that the average saccade amplitude is between 4° and 5° on scenes that are on average 20°–30° wide (Antes, 1974; Henderson & Hollingworth, 1998; Rayner, 1998). It is clear from examination of Buswell's (1935) and Yarbus' (1967) studies that saccade amplitude is also affected by task instruction and so naturally would be subject to some degree of cognitive control. However, because saccade amplitude (and by proxy, fixation placement) is governed by information not being directly fixated, it is unclear to what extent stimulus factors versus cognitive processes are influencing saccade patterns. Generally, the degree to which fixation duration and placement are influenced by viewing task in scene perception is largely unexplored (but see Henderson et al., 1999). 
A related issue concerns the manner in which fixation durations change over the course of viewing a scene. This issue has been investigated in a number of studies but has led to inconsistent results (Antes, 1974; De Graef, Christiaens, & d'Ydewalle, 1990; Friedman & Liebelt, 1981; Henderson et al., 1999; Unema, Pannasch, Joos, & Velichkovsky, 2005). Antes (1974) had participants view 10 different paintings for 20 s each under the instructions that they were to examine each one and choose their preferred painting at the end. With regard to fixation durations across the course of viewing a single painting, Antes found that the early fixations on the paintings tended to be of shorter duration (∼215 ms) than those in the later part of the viewing (∼310 ms) across all paintings. Antes related this trend to the participants' initial scanning of the painting with further scrutiny later on. However, he also noted that early fixations tended to fall on more informative areas initially and later fixations on less informative ones. It is not clear whether the change in fixations monitored over the course of viewing was due to a change in viewing strategy (i.e., the same task of deciding preference in mind), or reflected differences in the task itself (i.e., the participant is initially scanning the picture, followed by a bored perusal of the less interesting areas while waiting for the viewing period to end). Although some studies have reported this increase of fixation durations over the initial viewing of a scene (Antes, 1974; Unema et al., 2005), other studies have found the opposite pattern or no difference in fixation durations over the course of viewing the scene (De Graef et al., 1990). Differences in stimuli and task instructions across studies make comparison of these results difficult. 
It is clear from the studies reviewed above that task has a great influence on eye movements in scene viewing (Buswell, 1935; Henderson et al., 1999; Yarbus, 1967) just as it does in reading (e.g., reading vs. skimming, Masson, 1983). However, in scene viewing, it is unclear which specific eye movement parameters are affected by task. In the present study, we investigate the nature of fixation durations during scene viewing, how durations may differ given the task instructions, and how fixation durations fluctuate across the period spent viewing the scene. We compared the eye movement patterns of participants while they performed one of two scene viewing tasks: visual search and memorization. 
Methods
Participants
Twenty members of the Michigan State University undergraduate participant pool participated in this experiment. All had normal, uncorrected vision and received course credit or $7 for participation. 
Stimuli
Thirty-five photographs of complex real-world indoor and outdoor scenes were digitized for the experiment. Thirty of the scenes were the critical stimuli used in the analysis, and five were used as target-present fillers in the search condition. The scenes were displayed at a resolution of 800 × 600 pixels × 32,768 colors and subtended 15° of visual angle horizontally and 12° vertically at the viewing distance of 1.13 m. In the visual search condition, a word naming the search target was presented prior to the scene for that trial. The word was printed in a Times New Roman font in black against a gray background and subtended 4° by 1° on average. 
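The display geometry described above determines the conversion between pixels and visual angle used in the later analyses. The following is a worked sketch of that arithmetic, not the authors' code: the function names are illustrative, and the conversion assumes that pixels map linearly onto the 15° horizontal extent.

```python
from math import radians, tan


def pixels_per_degree(width_px=800, width_deg=15.0):
    """Average number of horizontal pixels per degree of visual angle."""
    return width_px / width_deg


def arcmin_per_pixel(width_px=800, width_deg=15.0):
    """Average visual angle covered by one pixel, in minutes of arc."""
    return 60.0 * width_deg / width_px


def physical_width_m(width_deg=15.0, distance_m=1.13):
    """Width of a centred stimulus that subtends width_deg at distance_m."""
    return 2.0 * distance_m * tan(radians(width_deg / 2.0))


print(round(pixels_per_degree(), 1))  # 53.3
print(arcmin_per_pixel())             # 1.125
print(round(physical_width_m(), 3))   # 0.298
```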
Apparatus
The stimuli were displayed on an NEC Multisync XE 15-inch monitor driven by a Hercules Dynamite Pro super video graphics adapter (SVGA) card with a refresh rate of 100 Hz. Eye movements were monitored using a Generation 5.5 Stanford Research Institute Dual Purkinje Image Eye Tracker (Crane, 1994), which has a resolution of 1 min of arc and a linear output over the range of the visual display used. A bite-bar and forehead rest maintained the participant's viewing position and distance. The right eye was tracked, though viewing was binocular. Signals were sampled from the eye tracker using the polling mode of the Data Translations DT2803 analog-to-digital converter, producing a sampling rate slightly greater than 1000 Hz. The display system and eye tracker were interfaced with a computer running a 90-MHz Pentium processor. The computer controlled the experiment and kept a complete record of eye position and time values over the course of each trial. 
Procedure
Participants first read a description of the experiment along with a set of instructions. The instructions indicated that participants were taking part in two experiments, one involving memorizing scenes, and the other involving searching for particular objects in scenes. Participants were told that their eye movements would be monitored during these tasks. Task was blocked and the order of task block was counterbalanced across participants. Each viewing task was explained separately just before it was to be performed, allowing each participant to receive a short break between the two task blocks. 
For the memorization task, participants were instructed to view each scene in preparation for a later memory test that would be administered at the end of the session. They were told that the memory test would examine memory for specific objects in the scenes. Following the instructions, the eye tracker was calibrated and three practice trials were given, after which the eye tracker was recalibrated and the participant viewed 15 scenes for 10 s each. For the search task, participants were instructed to locate within each scene the target object specified for that trial. They were further instructed that, once they had found the target, they should continue to look at it and press the response button. Prior to each trial, a word naming the target for that trial was presented in the center of the display for 2 s. The scene was then displayed for 10 s or until the participant pressed the response button indicating that the target was or was not present. To encourage participants to search the critical scenes exhaustively, the search target was never present in the 15 critical scenes. The search target was, however, present in the five filler scenes. 
In both conditions, eye tracker calibration consisted of having the participant fixate 4 calibration markers at the top, bottom, left, and right sides of the display area. Calibration was checked by displaying a calibration screen consisting of 5 test positions (center, top, bottom, and left and right sides) and a fixation marker that indicated the computer's estimate of the current fixation position. The participant fixated the test positions, and if the fixation marker was approximately within ±8 min arc of each (this is equivalent to 8 pixels, which was defined on the screen as a box surrounding the fixation marker), calibration was considered accurate. 
A trial consisted of the following events. First, the calibration screen was shown and calibration was checked. The eye tracker was recalibrated whenever calibration was deemed inaccurate. Following the calibration check, the participant fixated the center fixation cross on a gray background to indicate that he or she was ready for the trial to begin. The experimenter then started the trial: The fixation display was replaced by the trial scene in the memorization condition and by the search target word for 2 s followed by the search scene in the search condition. The scene remained visible for 10 s in both conditions unless the participant pressed the response button in the search condition. Following scene offset, the calibration screen reappeared. 
Each participant saw all 30 critical scenes plus 5 search filler scenes in a within-subject design. The 30 critical scenes were divided into two groups and the assignment of scene group to task condition was counterbalanced across participants so that each scene appeared in each task condition an equal number of times, but each participant saw each scene only once. The order of scene presentation in each task block was determined randomly for each participant. The entire experiment lasted approximately 45 minutes. 
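One way to realize a counterbalancing scheme of this kind is sketched below. It is an illustration only: the paper does not say how task order and scene-group assignment were combined, so the full 2 × 2 crossing, the scene numbering, the `build_schedule` name, and the seeded random generator are all assumptions, and the five search fillers are omitted.

```python
import random
from itertools import product


def build_schedule(n_participants=20, n_scenes=30, seed=0):
    """Return, per participant, two (task, scene_order) blocks.

    Assumption: task order and scene-group assignment are fully crossed,
    giving four counterbalancing cells that participants cycle through.
    """
    rng = random.Random(seed)
    scenes = list(range(n_scenes))
    group_a, group_b = scenes[: n_scenes // 2], scenes[n_scenes // 2:]
    cells = list(product(("memorize", "search"), (0, 1)))
    schedule = []
    for p in range(n_participants):
        first_task, mapping = cells[p % len(cells)]
        second_task = "search" if first_task == "memorize" else "memorize"
        if mapping == 0:
            by_task = {"memorize": group_a, "search": group_b}
        else:
            by_task = {"memorize": group_b, "search": group_a}
        blocks = []
        for task in (first_task, second_task):
            order = list(by_task[task])
            rng.shuffle(order)  # random scene order within each task block
            blocks.append((task, order))
        schedule.append(blocks)
    return schedule


schedule = build_schedule()
print(schedule[0][0][0], len(schedule[0][0][1]))  # first block of participant 0
```

With 20 participants and four cells, each scene is memorized by 10 participants and searched by the other 10, and every participant sees each critical scene exactly once.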
Results
Although participants were presented with 35 scenes in total, only the 30 critical scenes were included in the analysis. The visual search scenes analyzed did not have a target object in order to maximize the amount of time participants spent viewing the scene and to better approximate the time spent viewing the scene in the memorization task. 
Task performance
Participants reported approaching the two blocks as separate tasks. Memory performance on a later difficult object memory test in the memorization condition was over 80% correct. The memory test was a two-alternative forced-choice (2AFC) in which participants had to discriminate between a single object cropped from a previously viewed scene (in either of the tasks) and a distractor matched at its basic-level category, making it a relatively difficult memory test. A report focusing on the performance data for the memorization and search tasks can be found in Experiment 1 of Castelhano and Henderson (2005). Total scene viewing time was 8.7 s on average in the search condition, and 69% of the search scenes were viewed for the entire 10 s maximum. Therefore, the Search task was on average 1.3 s shorter than the Memorization task. However, this difference did not have any consequences for the measurements of interest, as shown below. Therefore, the primary eye movement analyses were based on all the data. 
Eye movement data analysis
Our primary concern in the present study was the nature of eye movement behavior as a function of viewing task. Raw eye movement data files consisted of time and position values. Saccades were defined as changes in eye position greater than 8 pixels (about 8.8 arcmins) in 15 ms or less. Manual inspection of the raw data files confirmed that this criterion effectively eliminated saccades while preserving slow drifts. Once saccades had been identified, fixation positions and durations were computed over the remaining data. The duration of a fixation was the elapsed time between two consecutive saccades. During a fixation, the eyes often drift. The scored position for a given fixation was the Euclidean average of the position samples (in pixel values) taken during that fixation weighted by the durations of each of those position samples. Thus, if the eye position drifted over two pixels before coming to rest on a third, the scored location of that fixation was the average value of the three pixel positions weighted by the amount of time the eyes were directed at each of the three pixels. The duration of that fixation was then taken as the sum of time on those three pixels (see Henderson, McClure, Pierce, & Schrock, 1997). Following this procedure, fixation durations less than 90 ms were removed to eliminate fixation artifacts resulting from signal overshoot in the dual-Purkinje Image eye tracker and fixation durations greater than 2000 ms were excluded as outliers, most likely resulting from false locks in tracking. This procedure eliminated 5.6% of fixations in the Visual Search task and 7.1% in the Memorization task. All data reduction and analysis were conducted using automated analysis software. 
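The parsing steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis software: the sample format `(t_ms, x_px, y_px)`, the function name, and the exact windowing logic are assumptions. In particular, this displacement test also trims up to about 15 ms of stable samples just before each saccade.

```python
from math import hypot


def detect_fixations(samples, px_thresh=8.0, win_ms=15.0,
                     min_ms=90.0, max_ms=2000.0):
    """Parse time-sorted (t_ms, x_px, y_px) samples into fixations."""
    n = len(samples)
    in_saccade = [False] * n
    for i in range(n):
        t0, x0, y0 = samples[i]
        k = i + 1
        # Look ahead at most win_ms for a displacement above px_thresh.
        while k < n and samples[k][0] - t0 <= win_ms:
            if hypot(samples[k][1] - x0, samples[k][2] - y0) > px_thresh:
                for m in range(i, k + 1):
                    in_saccade[m] = True
                break
            k += 1

    fixations, start = [], None
    for i in range(n + 1):
        steady = i < n and not in_saccade[i]
        if steady and start is None:
            start = i
        elif not steady and start is not None:
            run = samples[start:i]
            start = None
            duration = run[-1][0] - run[0][0]
            if not (min_ms <= duration <= max_ms):
                continue  # drop overshoot artifacts and tracking outliers
            # Weight each position by the time spent there (until next sample).
            weights = [run[j + 1][0] - run[j][0] for j in range(len(run) - 1)]
            x = sum(w * s[1] for w, s in zip(weights, run)) / duration
            y = sum(w * s[2] for w, s in zip(weights, run)) / duration
            fixations.append({"onset_ms": run[0][0], "duration_ms": duration,
                              "x": x, "y": y})
    return fixations


# Synthetic 1 kHz trace: hold at x=100, a 20 ms saccade, then hold at x=300.
trace = [(t, 100.0, 100.0) for t in range(200)]
trace += [(t, 100.0 + 10.0 * (t - 199), 100.0) for t in range(200, 220)]
trace += [(t, 300.0, 100.0) for t in range(220, 520)]
for fixation in detect_fixations(trace):
    print(fixation)
```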
Figure 1 shows the typical viewing patterns for two participants looking at a scene in the memorization and search instruction conditions, respectively. The circles represent fixations and the straight lines represent saccades. Consistent with the results of prior studies, viewers generally distributed their fixations across a large part of each scene, with the majority of fixations landing on or near objects (Buswell, 1935; Yarbus, 1967; see Henderson & Ferreira, 2004, for review). 
Figure 1
 
Typical viewing patterns for two participants looking at a scene in the (A) Memorization and (B) Visual Search instruction conditions, respectively. The participants were asked to look for a bucket in the visual search task.
Several measures were calculated to quantify viewers' eye movement patterns on the scenes and on specific objects as a function of viewing task. These measures are reported below. Eye movement measures for the entire scene will first be reported, followed by eye movement measures related to specific object processing in the scene. 
Eye movement behavior on whole scene
For the analyses of the whole scene, eye movement measures were analyzed from scene onset to the button press or timer expiration that terminated the trial. These analyses were conducted to provide a global overview of eye movements during the task and to allow comparison against eye movement behavior in other types of scene viewing studies. 
We examined ten eye movement measures to address the degree to which fixation patterns in complex real-world scenes change as a function of viewing task. The data from these measures are summarized in Table 1 and Figures 2, 3, 4, and 5.
Table 1
 
Global measures of eye movement behavior in complex real-world scenes as a function of viewing task.
| Measure | Memorization task: Mean | Memorization task: SE | Visual search task: Mean | Visual search task: SE | Difference |
|---|---|---|---|---|---|
| Percentage of scene area fixated | 48% | 0.08% | 37% | 0.09% | 11%* |
| Total scan path length | 82° | 2.2° | 74° | 2.4° | 8°** |
| Total number of fixations | 27.6 | 0.09 | 24.2 | 0.08 | 3.4** |
| Average fixation duration (ms) | 287 | 0.29 | 292 | 0.24 | −5 |
| Average saccade amplitude (deg) | 3.0° | 0.03 | 3.1° | 0.03 | −0.1 |
| Elapsed time to first saccade execution (ms) | 317 | 9.73 | 269 | 7.14 | 48** |
 

Note: * p < 0.05; ** p < 0.01.

Figure 2
 
The distribution of all fixations of all participants over the same scene. (Top) The original image viewed by participants. (Bottom left) The image showing placement of all fixations in the Memorization task condition. (Bottom right) The image showing fixation placement in the Visual Search task condition. As can be seen qualitatively in this figure, fixations tended to be more distributed in the memory condition and more focused on search-relevant regions in the search condition. In this image, participants were asked to look for a bucket and fixations were concentrated in the windows of the hardware store (see text for more information on how these images were created).
Figure 3
 
The fixation duration distribution for each task.
Figure 4
 
(A) The fixation duration by ordinal fixation number for each task. (B) The saccade amplitude by ordinal fixation number for each task. In order to quantify the differences, the first five saccades were analyzed for the ordinal fixation durations (see text for more details).
Figure 5
 
The saccade amplitude distribution for each task.
Spatial distribution of fixations
The first issue we examined was the degree to which eye movements during scene viewing are distributed differently as a function of viewing task. As discussed in the Introduction section, eye movements are typically distributed over informative scene regions at the expense of uniform and uninformative regions. Whether or not a given region is informative should be at least partly determined by its task relevance. Therefore, to the extent that cognitive factors influence the global distribution of fixations over a scene, the prediction is that the spatial distributions of fixations will change as a function of viewing task. 
Figure 2 shows the distribution of fixations in a scene for all participants over the same scene performing each of the task conditions. The figure was created by convolving a fixation map with a Gaussian filter. Each fixation was weighted according to the fixation's duration and the Gaussian filter was defined by a standard deviation of 1 degree to approximate the size of the fovea (for similar analyses of fixation distributions, see Henderson, 2003; Pomplun, Ritter, & Velichkovsky, 1996). To quantify the spatial distribution of fixations shown in Figure 2, we computed the percentage of each scene that was foveated across participants. To calculate the area fixated, a circular filter (1° radius) was placed centered on each fixation for each scene under each viewing condition; thus for each scene, the summed area occupied by the circular filters represented the total area fixated. By this measure, on average, participants fixated 48% of the area of the scenes in the memorization condition and 37% in the visual search condition across scenes, t(29) = 4.89, p < 0.05. These data thus support the qualitative observation that participants distributed their fixations more widely in the memorization condition and concentrated fixations to certain areas in the visual search condition that presumably would most likely contain the target object. 
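The two computations described above, a duration-weighted Gaussian fixation map and the percentage of scene area covered by 1° discs, can be sketched as follows. This is an illustrative approximation rather than the authors' procedure: the function names, the `(x_px, y_px, duration_ms)` fixation format, the coarse grid (`step`), and the pixel-to-degree conversion of 800 px per 15° are assumptions. The density is left unnormalized.

```python
from math import exp, hypot

PX_PER_DEG = 800 / 15.0  # assumed linear conversion for the 800-px-wide, 15° display


def density_at(x, y, fixations, sigma_px=PX_PER_DEG):
    """Duration-weighted sum of Gaussians (sigma = 1 deg) evaluated at (x, y).

    Summing one Gaussian per fixation is equivalent to convolving a map of
    duration-weighted fixation points with a Gaussian filter.
    """
    return sum(d * exp(-((x - fx) ** 2 + (y - fy) ** 2) / (2 * sigma_px ** 2))
               for fx, fy, d in fixations)


def percent_area_fixated(fixations, width=800, height=600,
                         radius_px=PX_PER_DEG, step=2):
    """Percentage of grid cells whose centre lies within radius_px of a fixation."""
    covered = total = 0
    for gy in range(0, height, step):
        for gx in range(0, width, step):
            total += 1
            cx, cy = gx + step / 2, gy + step / 2
            if any(hypot(cx - fx, cy - fy) <= radius_px for fx, fy, _ in fixations):
                covered += 1
    return 100.0 * covered / total


# Hypothetical fixations: (x_px, y_px, duration_ms)
demo = [(400, 300, 250), (420, 310, 300), (150, 120, 280)]
print(round(percent_area_fixated(demo), 2))
print(round(density_at(400, 300, demo), 1))
```

Overlapping discs are counted once, so the measure reflects the union of the fixated regions rather than the number of fixations.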
Total scan path length
As an additional measure of the dispersion of fixations in each of the tasks, we also calculated the total scan path length.¹ In degrees of visual angle, we found that on average the total length of the scan path was significantly greater in the Memorization task (82°) than in the Visual Search task (74°), t(19) = 2.77, p < 0.01. 
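As a rough illustration, a scan path length can be computed as the summed straight-line distance between consecutive fixation positions, converted to degrees. The paper's exact definition is not given in this section, so this definition, the function name, and the conversion factor of 800 px per 15° are assumptions.

```python
from math import hypot

PX_PER_DEG = 800 / 15.0  # assumed linear conversion for the 800-px-wide, 15° display


def scan_path_length_deg(positions, px_per_deg=PX_PER_DEG):
    """Sum of Euclidean distances between consecutive (x_px, y_px) fixations."""
    total_px = sum(hypot(x2 - x1, y2 - y1)
                   for (x1, y1), (x2, y2) in zip(positions, positions[1:]))
    return total_px / px_per_deg


# Hypothetical path: 160 px right (3 deg), then 120 px down (2.25 deg).
print(scan_path_length_deg([(0, 0), (160, 0), (160, 120)]))
```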
Total number of fixations
Related to the measure of total scan path length, we also examined the total number of fixations. We found that the total number of fixations was greater in the Memorization task (∼28) than the Visual Search task (∼24), t(19) = 5.21, p < 0.001. 
Although we found significant differences in the total scan path length and the total number of fixations as a function of task, it is possible that these differences were due to differences in viewing time. Specifically, for the Visual Search task, the total number of fixations was ∼87% of the Memorization task and the total scan path length was ∼90% of the Memorization task. This is equivalent to the difference in viewing times (viewing time for the Visual Search task was ∼86% of that for the Memorization task).² We conducted a second analysis to equate viewing times for these two measures by including only fixations occurring within the first 8.7 s for each task type. We found that the task had a marginal effect on total scan path length (Memorization: 72° and Visual Search: 66°, t(19) = 1.89, p = 0.075) and had a significant effect on fixation count (Memorization: 24 and Visual Search: 22, t(19) = 4.19, p < 0.01). Taken together, the effect of task on these two measures seems to be due to more than simply the increased time participants spent viewing the scene in the Memorization task. 
Fixation duration distribution
The fixation duration distribution for each task is shown in Figure 3. The shapes of the distributions for each task were similar. To examine the distributions more closely, we analyzed the mean and ordinal initial fixation durations for each task. 
Average fixation duration
Fixation duration has been shown to be a sensitive measure of ongoing cognitive processing in reading (Rayner, 1998) and similarly reflects ongoing processing in scene viewing (Henderson & Pierce, 2008; Henderson & Smith, in press). To determine whether overall fixation durations were on average different in the two task conditions, we computed each participant's average fixation duration across all scenes and compared the participant means as a function of viewing task. The average fixation duration was a non-significant 5 ms shorter in the memorization than the visual search condition, t(19) = 0.799, ns. 
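A by-participant comparison of this kind corresponds to a paired (within-subject) t-test on the participant means, which would give 19 degrees of freedom with 20 participants. The sketch below assumes that this is the test used. The function name and the example values are made up for illustration and are not data from the study.

```python
from math import sqrt


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired t-test on per-participant means.

    Assumes the paired differences are not all identical (nonzero variance).
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    standard_error = sqrt(variance / n)
    return mean_diff / standard_error, n - 1


# Hypothetical mean fixation durations (ms) for five participants.
memorization = [285.0, 290.0, 279.0, 301.0, 288.0]
visual_search = [291.0, 288.0, 286.0, 305.0, 295.0]
t_value, df = paired_t(memorization, visual_search)
print(round(t_value, 3), df)
```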
Fixation duration by ordinal fixation number
It has sometimes been reported that fixation duration changes over the course of scene viewing (Antes, 1974; Friedman & Liebelt, 1981; Unema et al., 2005). We therefore conducted an analysis of fixation durations as a function of ordinal fixation number in the scene and viewing task to determine whether durations change over the time course of scene viewing, and if so, whether differences in durations as a function of task might be revealed at different ordinal time points (Figure 4A). Analysis of the first five fixations on the scene revealed an effect of ordinal fixation number, F(19) = 18.03, p < 0.01, but no effect of task, F < 1. Further analyses over the first five fixations showed a significant linear trend present for the Visual Search task, F(98) = 42.86, p < 0.01, and Memorization task, F(98) = 8.93, p < 0.01. There were no effects for the last 5 fixations (all F's < 1).³ The increase in fixation durations over the first few fixations during the initial viewing of the scene in both tasks suggests a quick initial scene scan. 
Elapsed time to first saccade execution
At the onset of the scene, the participants' fixation is directed at the center of the scene. The initial perception of the scene involves identifying the scene being presented (Castelhano & Henderson, 2007, 2008a; Potter, 1976; Schyns & Oliva, 1994) but also deciding and planning where to direct the next fixation (Castelhano & Henderson, 2007). Holding all else equal, it seems reasonable to assume that differences across tasks in the time to execute the first saccade after scene onset are due to a combination of these processes. We analyzed the elapsed time to the first saccade and found that there was an effect of task, t(19) = 4.71, p < 0.01. Time to the execution of the first saccade upon scene onset was 48 ms longer in Memorization than in Visual Search, suggesting that more time was spent committing the information to memory and possibly planning the first eye movement. 
Saccade amplitude distribution
Figure 5 shows the frequency distribution of saccade amplitudes for each task condition. The saccade distributions for each task appear similar. To look at possible differences in saccade amplitude across task conditions, we analyzed the average saccade amplitude and the initial ordinal saccade amplitudes. 
Average saccade amplitude
The average saccade amplitude was only 0.06° longer in the Visual Search task than the Memorization task and did not differ significantly, t(19) = −0.62, ns. To more closely inspect a possible effect of task on saccade amplitude, we also analyzed the initial saccades made after scene onset. 
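The t(19) values reported throughout are what a paired-samples comparison across the 20 participants would yield (df = n − 1). The following is a minimal sketch of that computation, assuming each participant contributes one mean per task; the function name and its inputs are hypothetical, and this is not the authors' analysis code.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(cond_a: Sequence[float], cond_b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a paired-samples t-test on the differences cond_a - cond_b.

    Each position holds one participant's mean in the two conditions.
    Raises ZeroDivisionError if every difference is identical (SD of 0).
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length samples from at least 2 participants")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

The sign of t depends only on the order of subtraction, so a negative value simply indicates that the second condition had the larger mean.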
Saccade amplitude by ordinal fixation number
Figure 4B shows the average saccade amplitude by ordinal fixation number in each task. To analyze these distribution differences, we looked at the average length of the first 5 saccades made on the scene for each task in a within-subject ANOVA. There was an effect of task, F(1,19) = 9.73, p < 0.01, in which initial saccades were longer on average during Visual Search than Memorization. There was also an effect of ordinal saccade amplitude, F(1,19) = 5.93, p < 0.01. Further analyses showed significant linear trends for both the Visual Search task, F(1,98) = 8.87, p < 0.01 and Memorization task, F(1,98) = 4.81, p < 0.05. There was no significant interaction. 
Eye movement behavior on objects during scene viewing
For the object-level analyses reported below, eye movement behavior on discrete objects in the scenes was of interest. Scoring regions for these objects were defined by a rectangular box that was just large enough to encompass that object (see Figure 6). The pixel coordinates of the box were then taken as the position of the object. The same objects and scoring boxes were used in the analysis of the two viewing instructions. For the object analyses, each fixation in a scene was determined to be within or outside of the scoring box based on its pixel position value as defined above. Assignment of fixation position to object regions was independent of the initial generation of fixation positions. 
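The scoring-box assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Box` and `Fixation` types, the function names, and the choice to treat box edges as inclusive are all introduced here and do not come from the original analysis.

```python
from typing import Dict, NamedTuple, Optional


class Box(NamedTuple):
    """Rectangular scoring region in pixel coordinates (hypothetical layout)."""
    left: float
    top: float
    right: float
    bottom: float


class Fixation(NamedTuple):
    x: float
    y: float
    duration_ms: float


def in_box(fix: Fixation, box: Box) -> bool:
    # Edges are treated as inside the box (an assumption).
    return box.left <= fix.x <= box.right and box.top <= fix.y <= box.bottom


def assign_object(fix: Fixation, boxes: Dict[str, Box]) -> Optional[str]:
    """Return the name of the first box containing the fixation, or None."""
    for name, box in boxes.items():
        if in_box(fix, box):
            return name
    return None
```

Because the same boxes are applied to both viewing instructions, a function like `assign_object` would be called identically on Memorization and Visual Search fixations.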
Figure 6
 
A sample scene in which three isolated objects were defined in order to analyze differences in eye movement behavior on discrete objects. The only criterion for each scene was that the three chosen objects were not occluded.
We examined eight measures of eye movement behavior to address the degree to which eye movements on objects in complex real-world scenes change as a function of viewing task. With the exception of the proportion of objects fixated, each measure was conditional on fixation of the target object, so non-fixations did not contribute to the computed means. The measures were subdivided into two groups:
  •  measures that reflect properties of single fixations and
  •  measures that reflect aggregate fixations on objects.
The means for these measures are reported in Table 2.
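To make the conditional scoring concrete, the sketch below contrasts the one unconditional measure (proportion of objects fixated) with a conditional mean that ignores objects that were never fixated. The function name and the use of `None` to mark a non-fixated object are assumptions made for illustration only.

```python
from statistics import fmean
from typing import Optional, Sequence, Tuple


def summarize(values: Sequence[Optional[float]]) -> Tuple[float, Optional[float]]:
    """Summarize one measure (e.g., total time in ms) over object instances.

    Each entry is one object on one trial: the measured value, or None if the
    object was never fixated. Returns (proportion fixated, conditional mean).
    """
    if not values:
        raise ValueError("no object instances to summarize")
    fixated = [v for v in values if v is not None]
    proportion = len(fixated) / len(values)
    # Non-fixated objects are excluded rather than counted as zero.
    conditional_mean = fmean(fixated) if fixated else None
    return proportion, conditional_mean
```

Counting non-fixated objects as zeros instead would pull the mean down and confound it with the proportion fixated, which is why the two are kept separate here.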
Table 2
Measures of eye movement behavior on discrete objects in complex real-world scenes as a function of viewing task.
Measure                                       Memorization task     Visual search task    Difference
                                              Mean      SE          Mean      SE
Proportion of objects fixated                 0.66      0.020       0.53      0.024       0.13**
Average saccade amplitude to object (deg)     3.63      0.10        3.80      0.07        −0.17
Average fixation duration (ms)                290       9.08        279       6.09        11
Average first fixation duration (ms)          286       8.82        271       6.13        14
First gaze duration (ms)                      439       11.40       348       8.90        91**
First gaze fixation count                     1.6       0.05        1.3       0.04        0.3*
Total time (ms)                               830       30.40       644       18.50       185**
Total number of fixations                     2.7       0.07        2.2       0.05        0.53**
Note: * p < 0.01; ** p < 0.001.

Proportion of objects fixated
Of the three objects defined in each scene, we calculated the proportion of objects that were fixated at least once as a function of task. We found that the proportion of objects fixated was 0.13 greater in the Memorization than Visual Search task (0.66 vs. 0.53), t(19) = −4.23, p < 0.001. The result is consistent with the spatial distribution of fixations described above, which showed a wider distribution of fixations in the Memorization than in the Visual Search task. 
Average saccade amplitude to object
In order to look at possible effects of task on saccade amplitude, we also calculated the average saccade amplitude that preceded the first fixation on an object. We found that the 0.17° difference between tasks was not significant, t(19) = 1.07, ns.
Average fixation duration and first fixation duration
The average fixation duration was only 11 ms longer in the Memorization than Visual Search task, a difference that was not statistically significant, t(19) = −1.25, ns. The 14-ms difference between the tasks in first fixation duration was also not significant, t(19) = −1.43, ns.
First gaze duration
Initial processing of the objects was also measured by looking at the first gaze duration. First gaze duration is defined as the sum of all fixations made within the defined object region before the eyes fixate another location outside the region. We found that the average first gaze duration on objects was 91 ms longer in the Memorization than Visual Search task, t(19) = −7.27, p < 0.001. 
First gaze fixation count
In addition to duration, initial fixation density on individual objects during the initial processing was measured by looking at the first gaze fixation count. First gaze fixation count is defined as the initial number of fixations made within the defined object region before the eyes fixate another region. We found that the average first gaze fixation count was 1.6 fixations for the Memorization task and 1.3 for the Visual Search task, t(19) = −5.73, p < 0.01. 
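The two first-gaze measures follow directly from their definitions: both are computed over the first unbroken run of fixations inside an object's scoring region. A minimal sketch is given below, assuming each fixation has already been labeled with an object name (or `None` when outside all regions); the function name and the input format are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_gaze(
    fixations: Sequence[Tuple[Optional[str], float]], target: str
) -> Optional[Tuple[float, int]]:
    """Return (first gaze duration in ms, first gaze fixation count) for target.

    fixations is the ordered scan path as (object label or None, duration_ms).
    Returns None if the target was never fixated, so that non-fixated objects
    can be left out of the means.
    """
    total = 0.0
    count = 0
    for label, duration_ms in fixations:
        if label == target:
            total += duration_ms
            count += 1
        elif count > 0:
            break  # the eyes left the region, so the first gaze has ended
    return (total, count) if count else None
```

For example, a scan path that lands on an object twice in a row, leaves, and later returns would contribute only the first two fixations to these measures.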
Total time on objects
The Total Time on objects was calculated by summing all fixations on an object. We found that total time spent on objects was 185 ms longer when participants were performing the Memorization than when performing the Visual Search task, t(19) = −4.77, p < 0.001. 
Total number of fixations on objects
Related to the Total Time, we also looked at the average total number of fixations on an object over the whole viewing period. We found that the number of total fixations was greater for the Memorization than Visual Search task, t(19) = −7.05, p < 0.001, which is consistent with the other aggregate eye movement measures reported above. 
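In contrast to the first-gaze measures, the two whole-trial aggregates include every fixation on the object, including later returns. The sketch below illustrates this under the same assumed input format (an ordered list of labeled fixations); the function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def total_time_and_count(
    fixations: Sequence[Tuple[Optional[str], float]], target: str
) -> Optional[Tuple[float, int]]:
    """Return (total time in ms, total number of fixations) on target.

    Sums over all fixations on the object across the whole viewing period.
    Returns None if the object was never fixated.
    """
    durations = [duration_ms for label, duration_ms in fixations if label == target]
    if not durations:
        return None
    return sum(durations), len(durations)
```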
General discussion
The present study investigated how task instruction affects eye movement patterns during the viewing of a scene. The effect of task on eye movement patterns has been long established by pioneers of research into eye movement patterns (Buswell, 1935; Yarbus, 1967). However, as convincing as they are, the accounts of these effects are descriptive, fixation data are depicted as images, and the results lack quantification. We sought to provide quantitative analyses, with a specific emphasis on investigating the nature of fixation durations in addition to their placement. We found that task effects are observed at both the scene and object level of analysis. We also found that task affected both the placement and fixation duration patterns during scene viewing. 
Task effects on eye movements across the whole scene
At the level of the whole scene, fixations were more distributed in the memorization condition and more focused on search-relevant regions in the search condition, which directly replicates the findings of Buswell (1935) and Yarbus (1967). This is not a surprising finding when we consider the strategies involved for each task. In the Memorization task, participants were told that they would be tested on specific objects within the scene, and so to improve encoding of the different objects, it makes sense that they would try to fixate as many different objects as possible. This pattern of spreading fixations over many different objects was also reflected to some extent in the total scan pattern and to a greater extent in the greater total number of fixations; that is, there was a numerically longer scan pattern and a higher count of fixations in the Memorization than in Visual Search task. In the Visual Search task, fixations were more narrowly focused within the scene, and we can assume that participants limited their fixations to areas that most likely contained the target. This finding is consistent with other studies showing that context information leads to more efficient searches (Brockmole, Castelhano, & Henderson, 2006; Brockmole & Henderson, 2006, 2008; Castelhano & Henderson, 2007; Chun & Jiang, 1998; Neider & Zelinsky, 2006). Furthermore, this effect of context has been implemented in a recent computational model by Torralba et al. (2006), which showed that participants' fixations largely remained within scene areas that were statistically most likely to contain the target object. 
We found that for both tasks average fixation durations increased as viewing time increased (for the first 5 fixations) and then remained stable in the later viewing period. This finding is consistent with earlier studies that reported similar patterns (Antes, 1974; Friedman & Liebelt, 1981; Unema et al., 2005). We found that the fixation durations stopped increasing after only ∼2 s. This steep increase during the first seconds of viewing is also found in other studies in which the task demands rely on the quality of the initial performance (e.g., ∼3.4 s in Unema et al., 2005). One could conclude based on the fixation duration data that the initial scanning of the scenes did not differ between tasks. However, the saccade amplitude measure across the whole scene seems to point to a different pattern, which we will turn to now. 
When we examined average saccade amplitude, there were no systematic differences between tasks; however, there were differences in the saccade amplitude during the initial viewing of the scene. We found that participants made longer saccades during initial viewing in Visual Search versus Memorization, while later saccades did not differ across tasks. Again, this difference can be attributed to the strategies that participants are implementing as they examine the scene in each task condition. However, it is not clear whether the difference in saccade amplitude is due to the participants staying closer to the center during the initial encoding for the Memorization task, or whether they are simply scanning the whole scene more thoroughly in the Visual Search task. It is not clear what a proper baseline for these tasks would be, but in a preference rating task, Antes (1974) found that the average saccade amplitudes seem to decrease with increased viewing time. If we see this as the default (an initial wide scanning of the scene within the first few seconds of viewing), then it may be that when a memorization strategy is implemented the system can immediately start to examine details without the need for an initial wide scanning of the scene. This finding is in direct contrast to other studies that report that the first few fixations are controlled by stimulus factors alone (De Graef et al., 1990; Mannan, Ruddock, & Wooding, 1995). For instance, Mannan et al. (1995) measured eye movements while viewers examined grayscale photographs for 3 s each. The photographs were either high-pass filtered, low-pass filtered, or unfiltered. Results showed that fixation positions were similar on the unfiltered and low-pass filtered scenes during the first 1.5 s of viewing. 
However, as noted by Henderson and Ferreira (2004), even if eye movement control is largely determined by stimulus factors during the initial scanning of the scene, this does not prevent fixations from being influenced by task. In the present study, the immediate implementation of the memorization strategy is also seen with the elapsed time to the first saccade and is discussed further below. 
The elapsed time to the execution of the first saccade was much longer for the memorization task than the visual search task. This elapsed time until the first fixation (or the initial fixation at scene onset) is theoretically different from other fixations made on the scene because it involves identifying the scene being presented (Castelhano & Henderson, 2007, 2008a; Potter, 1976; Schyns & Oliva, 1994), as well as deciding and planning where to target the next fixation (Castelhano & Henderson, 2007). The additional 48 ms in the Memorization task, in addition to the shorter initial saccade amplitudes discussed above, suggest that the effect of task was immediate. That is, the largest differences between the tasks were seen within the first few seconds of viewing with both saccade amplitude and fixation duration becoming similar in the latter part of the viewing period. This immediate effect is interesting in light of other top-down influences, such as the effect of scene context on the examination of objects within the scene (De Graef et al., 1990; Henderson et al., 1999; Rayner, Castelhano, & Yang, 2009), which seem to only emerge in later viewing. Top-down effects due to scene semantics seem to take a while to onset (relative to the whole viewing period), while top-down effects of task are seen immediately and seem to be more pronounced in the first few seconds of viewing. 
Task effects on eye movements on objects
To better understand the effect of tasks on the examination of objects within the scenes, we also looked at fixation patterns on objects. As would be expected from the effect of task on the distribution of fixations, we also found that participants tended to examine more objects in the memorization task condition. However, theoretically more interesting is the failure to observe an effect of task on the average fixation duration. The reason this is interesting or even surprising is that the lack of an effect of the task goes against the findings in the reading literature, in which effects of task, context, word difficulty, and word length are seen at the level of the average fixation duration (Rayner, 1998). Instead, we found that task affected gaze duration by modifying the number of fixations within a gaze on a given object. The same pattern was also seen across other aggregate measures of eye movements on the objects viewed in the scenes. This finding is consistent with the failure to observe effects of other factors on individual fixation durations during scene viewing (see Henderson & Ferreira, 2004; Henderson & Hollingworth, 1998, for review; but see Henderson & Pierce, 2008; Henderson & Smith, in press). 
In general, participants tended to spend more time fixating objects in the Memorization task than in the Visual Search task. However, this was seen in the number of times that the objects were fixated, not in the average fixation duration. This finding is consistent with an earlier study by Loftus (1972) that reported memory for scene regions was not related to the average fixation duration but rather to the number of fixations made on the region. The finding that the number of fixations is greater for the memory than the visual search task can be easily attributed to a system-level strategy by which visual information in the memory task is encoded more thoroughly. However, the fact that an equivalent effect is not found at the level of fixation duration may indicate a limit in the architecture governing the decision of when to move the eyes. Rather than influencing when the eyes move, the effect of the task on scenes was observed with regard to where the eyes move. 
When to move the eyes during scene perception
Based on Morrison's (1984) reading model, researchers (Henderson, 1992; Rayner, 1998) have suggested that the decision of when to move the eyes during scene viewing is based on the processing of currently fixated visual information to a certain level. In reading, that level is thought to be lexical access of the word (Reichle et al., 2003), whereas in scene viewing, it is proposed to be the recognition of the object at fixation (Henderson, 1992). In a recent set of studies, Henderson and colleagues (Henderson & Pierce, 2008; Henderson & Smith, in press) investigated the degree to which the currently fixated visual information affects fixation durations on scenes. By masking the stimulus at the end of a saccade, the availability of the scene information was delayed. The rationale (based on Rayner & Pollatsek, 1981) was that if fixation durations depend on the information currently being encoded, then the fixation durations should increase in proportion to the delay of the stimulus onset. Results showed that although there was a subpopulation of fixations whose durations were not affected by the delay, for a second population of fixations there was a substantial link between the availability of the fixated information and the duration of the fixations. The authors conclude that fixation duration in scene viewing is partially controlled by the immediately available information from the scene. 
The effect of the information at fixation on duration has also been reported in other scene viewing studies. For example, researchers have found that stimulus factors such as semantically inconsistent objects result in longer fixation durations (Friedman, 1979; Henderson et al., 1999). These longer fixation durations have been attributed to the requirement of more visual analysis of local details for inconsistent objects and to consolidation of the inconsistent information into the familiar scene gist or schema. So, for semantic anomalies, fixation durations are thought to be directly affected by higher order cognitive processes (Henderson & Hollingworth, 1999). 
However, until now no study (that we are aware of) has directly investigated the effects of task on individual fixations. In a study by Henderson et al. (1999), task instructions were manipulated across experiments, and on average, fixation durations were numerically longer (38 ms) in the Memorization experiment than in the Visual Search experiment. In the present study, we found an 11-ms difference that was not significant. The earlier study was not designed to directly compare the effects of task on fixation duration and no gaze duration measures were reported (due to the fact that when fixations landed on or near the target in the visual search task, the participant would terminate the trial). So, from this previous study, it is difficult to determine what role task instruction played in controlling fixation and gaze durations during scene viewing. 
In the current study, we found that task influences gaze duration but not average fixation duration. This finding of longer gaze durations is similar to higher order cognitive effects mentioned above (i.e., semantic inconsistency); time spent examining objects in a memorization task is greater than the time spent during a visual search, just as time fixating a semantically inconsistent object is greater than a semantically consistent one. Based on the notion that fixation durations are determined by the amount of processing completed (Henderson, 1992; Rayner, 1998), it is possible to posit why there is a qualitative difference between the increase in examining individual objects within scenes for inconsistent semantic information vs. task instruction information. For the inconsistent information, fixation durations could be longer either because object recognition processes for unexpected objects require more time, or because memory consolidation for unexpected objects takes longer (Henderson, 1992). Slowing down of object recognition does not apply to the present study. However, even though the objects are easily recognized, a successful implementation of a memorization task instruction requires that object information be properly consolidated to ensure that the information is available for the memory test. Thus, to increase the time spent on the currently fixated information, the system simply plans a refixation of the object when the criteria for saccade initiation (e.g., object recognition) are met. 
Conclusions
We have attempted to outline in this paper the specific effects of task on eye movement control during examination of real-world scenes. This research has highlighted some differences from reading that are perhaps unexpected. Namely, the influence of task is found at the level of aggregate eye movement measures and not at the level of individual fixations. As discussed above, although this finding is consistent with some other studies on scene perception, it is inconsistent with other studies examining higher order processing. For instance, individual fixation durations are thought to be affected by semantic anomalies. Here, we found that for a memorization task, in which performance would benefit from longer fixation times, the individual fixation durations did not differ from a visual search task. This is the most intriguing finding of the present study and more research is needed to differentiate between these different effects of higher order cognitive processing on the control of gaze. For instance, in the case of some ambiguous visual information where the task is to memorize versus search, will fixation durations then reflect differences in task? Previous studies suggest that longer fixation durations reflect the need for more visual analysis of local details or for consolidation of inconsistent information into a schema. Building on the present study, we believe that manipulating task and stimulus properties would give rise to valuable information on the control of eye movements and the architecture of the visual system. 
Acknowledgments
This work was supported by grants from the National Science Foundation (BCS-0094433 and ECS-9873531), NSF IGERT Program (DGE0114378), the Army Research Office (W911NF-04-1-0078), and the Economic and Social Research Council of the UK (RES-062-23-1092) to JMH and by a grant from the Natural Sciences and Engineering Research Council of Canada to MSC. We thank George Malcolm for comments on an earlier draft of this article, and Eyal Reingold and another anonymous reviewer for helpful comments. 
Commercial relationships: none. 
Corresponding author: Monica Castelhano. 
Email: monica.castelhano@queensu.ca. 
Address: Department of Psychology, Queen's University, Kingston, ON, K7L 3N6, Canada. 
Footnotes
1  We would like to thank Eyal Reingold for this suggestion.
2  We thank an anonymous reviewer for making this suggestion.
3  For both the fixation duration and saccade amplitude ordinal analyses, fixations 20–24 were analyzed as the last 5 fixations (ordinal fixations greater than 24 were much less frequent for the Visual Search task than the Memorization task).
References
Antes, J. R. (1974). The time course of picture viewing. Journal of Experimental Psychology, 103, 62–70.
Brockmole, J. R., Castelhano, M. S., & Henderson, J. M. (2006). Contextual cueing in naturalistic scenes: Global and local contexts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 699–706.
Brockmole, J. R., & Henderson, J. M. (2006). Using real-world scenes as contextual cues for search. Visual Cognition, 13, 99–108.
Brockmole, J. R., & Henderson, J. M. (2008). Prioritizing new objects for eye fixation in real-world scenes: Effects of object-scene consistency. Visual Cognition, 16, 375–390.
Buswell, G. (1935). How people look at pictures. Oxford, England: University of Chicago Press.
Castelhano, M. S., & Henderson, J. M. (2005). Incidental visual memory for objects in scenes. Visual Cognition, 12, 1017–1040.
Castelhano, M. S., & Henderson, J. M. (2007). Initial scene representations facilitate eye movement guidance in visual search. Journal of Experimental Psychology: Human Perception and Performance, 33, 753–763.
Castelhano, M. S., & Henderson, J. M. (2008a). Stable individual differences across images in human saccadic eye movements. Canadian Journal of Experimental Psychology, 62, 1–14.
Castelhano, M. S., & Henderson, J. M. (2008b). The influence of color on the perception of scene gist. Journal of Experimental Psychology: Human Perception and Performance, 34, 660–675.
Castelhano, M. S., & Rayner, K. (2008). Eye movements during reading, visual search, scene perception: An overview. In K. Rayner, D. Shem, X. Bai, & G. Yan (Eds.), Cognitive and cultural influences on eye movements (pp. 175–195). Tianjin: Tianjin People's Publishing House / Hovee.
Chun, M. M., & Jiang, Y. (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36, 28–71.
Crane, H. D. Kelley, D. H. (1994). The Purkinje image eyetracker, image stabilization and related forms of stimulus manipulation. Visual science and engineering: Models and applications. (pp. 15–89). New York: Marcel Dekker.
De Graef, P., Christiaens, D., & d'Ydewalle, G. (1990). Perceptual effects of scene context on object identification. Psychological Research, 52, 317–329.
Engbert, R., Longtin, A., & Kliegl, R. (2002). A dynamical model of saccade generation in reading based on spatially distributed lexical processing. Vision Research, 42, 621–636.
Engbert, R., Nuthmann, A., Richter, E. M., & Kliegl, R. (2005). SWIFT: A dynamical model of saccade generation during reading. Psychological Review, 112, 777–813.
Findlay, J. M., & Gilchrist, I. D. (2001). Visual attention: The active vision perspective. In M. Jenkins & L. Harris (Eds.), Vision and attention (pp. 83–103). New York: Springer-Verlag.
Friedman, A. (1979). Framing pictures: The role of knowledge in automatized encoding and memory for gist. Journal of Experimental Psychology: General, 108, 316–355.
Friedman, A., & Liebelt, L. S. (1981). On the time course of viewing pictures with a view towards remembering. In D. F. Fisher, R. A. Monty, & J. W. Senders (Eds.), Eye movements: Cognition and visual perception (pp. 137–155). Hillsdale, NJ: Erlbaum.
Henderson, J. M. (1992). Visual attention and eye movement control during reading and picture viewing. In K. Rayner (Ed.), Eye movements and visual cognition: Scene perception and reading (pp. 260–283). New York: Springer-Verlag.
Henderson, J. M. (2003). Human gaze control during real-world scene perception. Trends in Cognitive Sciences, 7, 498–504.
Henderson, J. M. (2007). Regarding scenes. Current Directions in Psychological Science, 16, 219–222.
Henderson, J. M., Brockmole, J. R., Castelhano, M. S., & Mack, M. (2007). Visual saliency does not account for eye movements during visual search in real-world scenes. In R. van Gompel, M. Fischer, W. Murray, & R. Hill (Eds.), Eye movements: A window on mind and brain (pp. 537–562). Oxford, UK: Elsevier.
Henderson, J. M., & Ferreira, F. (2004). Scene perception for psycholinguists. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 1–58). New York: Psychology Press.
Henderson, J. M., & Hollingworth, A. (1998). Eye movements during scene viewing: An overview. In G. Underwood (Ed.), Eye guidance while reading and while watching dynamic scenes (pp. 269–293). Oxford, UK: Elsevier.
Henderson, J. M., & Hollingworth, A. (1999). High-level scene perception. Annual Review of Psychology, 50, 243–271.
Henderson, J. M., McClure, K. K., Pierce, S., & Schrock, G. (1997). Object identification without foveal vision: Evidence from an artificial scotoma paradigm. Perception & Psychophysics, 59, 323–346.
Henderson, J. M., & Pierce, G. L. (2008). Eye movements during scene viewing: Evidence for mixed control of fixation durations. Psychonomic Bulletin & Review, 15, 566–573.
Henderson, J. M., & Smith, T. J. (in press). Visual Cognition.
Henderson, J. M., Weeks, P. A., Jr., & Hollingworth, A. (1999). Effects of semantic consistency on eye movements during scene viewing. Journal of Experimental Psychology: Human Perception and Performance, 25, 210–288.
Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2, 194–203.
Kliegl, R., & Engbert, R. (2003). SWIFT explorations. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movement research (pp. 391–411). Amsterdam: Elsevier Science.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4, 219–227.
Krieger, G., Rentschler, I., Hauske, G., Schill, K., & Zetzsche, C. (2000). Object and scene analysis by saccadic eye-movements: An investigation with higher-order statistics. Spatial Vision, 13, 201–214.
Loftus, G. R. (1972). Eye fixations and recognition memory for pictures. Cognitive Psychology, 3, 525–551. [CrossRef]
Loftus, G. R. Mackworth, N. H. (1978). Cognitive determinants of fixation location during picture viewing. Journal of Experimental Psychology: Human Perception and Performance, 4, 565–572. [PubMed] [CrossRef] [PubMed]
Mackworth, N. H. Morandi, A. J. (1967). The gaze selects informative details within pictures. Perception & Psychophysics, 2, 547–552. [CrossRef]
Mannan, S. Ruddock, K. H. Wooding, D. S. (1995). Automatic control of saccadic eye movements made in visual inspection of briefly presented 2-D images. Spatial Vision, 9, 363–386. [PubMed] [CrossRef] [PubMed]
Mannan, S. K. Ruddock, K. H. Wooding, D. S. (1996). The relationship between the locations of spatial features and those of fixations made during visual examination of briefly presented images. Spatial Vision, 10, 165–188. [PubMed] [CrossRef] [PubMed]
Mannan, S. K. Ruddock, K. H. Wooding, D. S. (1997). Fixation sequences made during visual examination of briefly presented 2D images. Spatial Vision, 11, 157–178. [PubMed] [CrossRef] [PubMed]
Masson, M. E. (1983). Conceptual processing of text during skimming and rapid sequential reading. Memory & Cognition, 11, 262–274. [PubMed] [CrossRef] [PubMed]
Morrison, R. E. (1984). Manipulation of stimulus onset delay in reading: Evidence for parallel programming of saccades. Journal of Experimental Psychology: Human Perception & Performance, 10, 667–682. [PubMed] [CrossRef]
Neider, M. B. Zelinsky, G. J. (2006). Scene context guides eye movements during visual search. Vision Research, 46, 614–621. [PubMed] [CrossRef] [PubMed]
Parkhurst, D. Law, K. Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42, 107–123. [PubMed] [CrossRef] [PubMed]
Parkhurst, D. J. Niebur, E. (2003). Scene content selected by active vision. Spatial Vision, 6, 125–154. [PubMed] [CrossRef]
Pomplun, M. Ritter, H. Velichkovsky, B. (1996). Disambiguating complex visual information: Towards communication of personal views of a scene. Perception, 25, 931–948.
Potter, M. C. (1976). Short-term conceptual memory for pictures. Journal of Experimental Psychology: Human Learning and Memory, 2, 509–522.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Rayner, K. Castelhano, M. S. Yang, J. (2009). Eye movements when looking at unusual/weird scenes: Are there cultural differences? Journal of Experimental Psychology: Learning, Memory & Cognition, 35, 254–259.
Rayner, K. Pollatsek, A. (1981). Eye movement control during reading: Evidence for direct control. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 33, 351–373.
Reichle, E. D. Pollatsek, A. Fisher, D. L. Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105, 125–157.
Reichle, E. D. Pollatsek, A. Rayner, K. (2006). Modeling the effects of lexical ambiguity on eye movements during reading. In R. P. G. van Gompel, M. F. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain (pp. 271–292). Oxford, England: Elsevier.
Reichle, E. D. Rayner, K. Pollatsek, A. (2003). The E-Z reader model of eye‐movement control in reading: Comparisons to other models. Behavioral and Brain Sciences, 26, 445–476.
Reinagel, P. Zador, A. M. (1999). Natural scene statistics at the centre of gaze. Network, 10, 341–350.
Schyns, P. G. Oliva, A. (1994). From blobs to boundary edges: Evidence for time and spatial scale dependent scene recognition. Psychological Science, 5, 195–200.
Tatler, B. W. Baddeley, R. J. Gilchrist, I. D. (2005). Visual correlates of fixation selection: Effects of scale and time. Vision Research, 45, 643–659.
Torralba, A. Oliva, A. Castelhano, M. S. Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: The role of global features in object search. Psychological Review, 113, 766–786.
Unema, P. J. A. Pannasch, S. Joos, M. Velichkovsky, B. M. (2005). Time course of information processing during scene perception: The relationship between saccade amplitude and fixation duration. Visual Cognition, 12, 473–494.
van Diepen, E. M. J. d'Ydewalle, G. (2003). Early peripheral and foveal processing in fixations during scene perception. Visual Cognition, 10, 79–100.
Yarbus, I. A. (1967). Eye movements and vision.
Figure 1
 
Typical viewing patterns for two participants looking at a scene in the (A) Memorization and (B) Visual Search instruction conditions, respectively. The participants were asked to look for a bucket in the visual search task.
Figure 2
 
The distribution of all fixations of all participants over the same scene. (Top) The original image viewed by participants. (Bottom left) The image showing the placement of all fixations in the Memorization task condition. (Bottom right) The image showing fixation placement in the Visual Search task condition. As can be seen qualitatively in this figure, fixations tended to be more distributed in the memory condition and more focused on search-relevant regions in the search condition. In this image, participants were asked to look for a bucket, and fixations were concentrated in the windows of the hardware store (see text for more information on how these images were created).
Figure 3
 
The fixation duration distribution for each task.
Figure 4
 
(A) The fixation duration by ordinal fixation number for each task. (B) The saccade amplitude by ordinal fixation number for each task. In order to quantify the differences, the first five saccades were analyzed for the ordinal fixation durations (see text for more details).
Figure 5
 
The saccade amplitude distribution for each task.
Figure 6
 
A sample scene in which three isolated objects were defined in order to analyze differences in eye movement behavior on discrete objects. The only criterion for each scene was that the three chosen objects were not occluded.
Table 1
 
Global measures of eye movement behavior in complex real-world scenes as a function of viewing task.
Measure                                         Memorization task     Visual search task    Difference
                                                Mean      SE          Mean      SE
Percentage of scene area fixated                48%       0.08%       37%       0.09%       11%*
Total scan path length                          82°       2.2°        74°       2.4°        8°**
Total number of fixations                       27.6      0.09        24.2      0.08        3.4**
Average fixation duration (ms)                  287       0.29        292       0.24        −5
Average saccade amplitude (deg)                 3.0°      0.03        3.1°      0.03        −0.1
Elapsed time to first saccade execution (ms)    317       9.73        269       7.14        48**
 

Note: * p < 0.05; ** p < 0.01.

Table 2
 
Measures of eye movement behavior on discrete objects in complex real-world scenes as a function of viewing task.
Measure                                         Memorization task     Visual search task    Difference
                                                Mean      SE          Mean      SE
Proportion of objects fixated                   0.66      0.020       0.53      0.024       0.13**
Average saccade amplitude to object (deg)       3.63      0.10        3.80      0.07        −0.17
Average fixation duration (ms)                  290       9.08        279       6.09        11
Average first fixation duration (ms)            286       8.82        271       6.13        14
First gaze duration (ms)                        439       11.40       348       8.90        91**
First gaze fixation count                       1.6       0.05        1.3       0.04        0.3*
Total time (ms)                                 830       30.40       644       18.50       185**
Total number of fixations                       2.7       0.07        2.2       0.05        0.53**
 

Note: * p < 0.01; ** p < 0.001.
