The mask-onset delay paradigm and the availability of central and peripheral visual information during scene viewing
Journal of Vision January 2012, Vol.12, 9. doi:10.1167/12.1.9
      Mackenzie G. Glaholt, Keith Rayner, Eyal M. Reingold; The mask-onset delay paradigm and the availability of central and peripheral visual information during scene viewing. Journal of Vision 2012;12(1):9. doi: 10.1167/12.1.9.

Abstract

We employed a variant of the mask-onset delay paradigm in order to limit the availability of visual information in central and peripheral vision within individual fixations during scene viewing. Subjects viewed full-color scene photos with instructions to search for a target object (Experiment 1) or to study them for a later memory test (Experiment 2). After a fixed interval following the onset of each eye fixation (50–100 ms), the scene was scrambled either in the central visual field or over the entire display. The intact scene was presented when the subject made an eye movement. Our results reconcile different sets of findings from prior research regarding the masking of central and peripheral visual information at different intervals following fixation onset. In particular, we found that when the entire display was scrambled, both search and memory performance were impaired even at relatively long mask-onset intervals. In contrast, when central vision was scrambled, there were subtle impairments that depended on the viewing task. At the 50-ms mask-onset interval, subjects were selectively impaired at identifying, but not at locating, the search target (Experiment 1), while memory performance (Experiment 2) was unaffected in this condition. Hence, the reliance on central and peripheral visual information depends partly on the viewing task.

Introduction
While performing visual tasks such as reading, visual search, and scene perception, human observers make eye movements, known as saccades, in order to align their high-acuity central vision with different areas of the visual field. While the eye is in motion during a saccade, the extraction of information from the visual field is largely suppressed. Between saccades, the eye is relatively still (fixated) and it is during these fixation events that visual information is extracted (for reviews, see Rayner, 1998, 2009). While it has been demonstrated that visual information can be extracted extremely rapidly during fixations, there is some controversy regarding the extent to which this depends on both the type of information that is fixated (e.g., words compared to objects or natural images) and the viewing task that the observer is undertaking (e.g., reading compared to visual search or scene viewing). In particular, recent studies in scene perception have produced a mixed pattern of results with regard to the time required within individual fixations to adequately encode the visual information from scenes. This discrepancy is the focus of the present study. Accordingly, we begin with a review of prior findings regarding the encoding of visual information during scene perception. In particular, we focus on studies that employed gaze-contingent masking of scene information to estimate the time required within individual fixations to encode visual information from scenes (variants of the mask-onset delay paradigm). Next, we outline the rationale for the present experiments, which employed a version of the mask-onset delay paradigm (Rayner, Smith, Malcolm, & Henderson, 2009; van Diepen, De Graef, & d'Ydewalle, 1995; van Diepen, Ruelens, & d'Ydewalle, 1999) in order to limit the availability of stimulus information within eye fixations either in central vision or over the entire scene. 
Research on reading has shown that reading can proceed quite normally even when the words are visible for only a very short duration within a fixation (Ishida & Ikeda, 1989; Liversedge et al., 2004; Rayner, Inhoff, Morrison, Slowiaczek, & Bertera, 1981; Rayner, Liversedge, & White, 2006; Rayner, Liversedge, White, & Vergilino-Perez, 2003; Slowiaczek & Rayner, 1987). These studies employed a gaze-contingent display change methodology in which words were masked or removed a short interval after they were fixated (e.g., 50–60 ms). Despite such an extreme restriction on the availability of visual information from the fixated word (consider that readers typically fixate a word for 200–300 ms), readers are still able to encode the fixated words and extract the meaning of the text they are reading under such conditions. This demonstrates that the visual information required to identify words can be extracted extremely rapidly and very early within an eye fixation. 
While the rapid encoding of visual information in reading may be facilitated by the relatively shallow visual complexity of printed words, it has also been demonstrated that certain information can be extracted extremely rapidly from more complex visual stimuli, such as scenes. For example, a large body of research has shown that basic scene category information (often called “gist”) can be extracted from just a very brief exposure (Biederman, 1981; Biederman, Mezzanotte, & Rabinowitz, 1982; Castelhano & Henderson, 2007, 2008; Fei-Fei, Iyer, Koch, & Perona, 2007; Greene & Oliva, 2009; Intraub, 1980; Joubert, Rousselet, Fize, & Fabre-Thorpe, 2007; Loschky et al., 2007; Oliva & Schyns, 1997; Potter, 1975; Schyns & Oliva, 1994; Thorpe, Fize, & Marlot, 1996; Van Rullen & Thorpe, 2001; Võ & Henderson, 2010; Võ & Schneider, 2010; for reviews, see Henderson & Ferreira, 2004; Henderson & Hollingworth, 1999). It has been shown that the gist of a scene can be extracted with exposure durations as short as 50 ms (Castelhano & Henderson, 2008; Loschky et al., 2007). However, other studies have suggested that gist extraction can take at least 75 ms (Võ & Henderson, 2010) or more than 100 ms (Potter, 1975). 
Prior studies on the extraction of gist information from scenes have tended to employ paradigms in which the scene is presented suddenly while the viewer is presumably in fixation, and the scene is subsequently masked by other scenes (i.e., rapid serial presentation), or by a noise mask, after a given exposure interval. Other researchers have taken a different approach by trying to limit the amount of time that scene information is presented within individual eye fixations. This approach is potentially more powerful because it allows the researcher to study the processing of scene information in situations that more closely resemble normal scene viewing. Specifically, during normal scene viewing, viewers tend to make multiple eye fixations, presumably in order to acquire scene information beyond mere gist (e.g., to search for a particular object within the scene or to commit the scene to memory). A gaze-contingent methodology known as the mask-onset delay paradigm has been developed in order to examine scene perception under these more natural viewing conditions. In this paradigm, eye movements are monitored and a computer display is updated such that scene information is masked a certain interval after the onset of each eye fixation (Rayner et al., 2009; van Diepen et al., 1995, 1999; for a related paradigm known as the scene-onset delay paradigm, see Henderson & Pierce, 2008; Henderson & Smith, 2009; van Diepen & d'Ydewalle, 2003). This type of paradigm is akin to the gaze-contingent word masking paradigms that were developed in the context of reading research (e.g., Ishida & Ikeda, 1989; Liversedge et al., 2004; Rayner et al., 1981, 2006, 2003; Rayner & Pollatsek, 1981; Slowiaczek & Rayner, 1987).
In an early application of this methodology to study scene perception, van Diepen et al. (1995) had viewers examine black and white line drawings of scenes containing a collection of objects and non-objects. Viewers were required to search the scene and count the number of unfamiliar objects (non-objects) that were present. Visual information within the 2–3 degrees of visual angle surrounding the fixation point was replaced by a noise mask after a short interval following fixation onset (either 15, 45, 75, or 120 ms). By comparing the masking conditions to a free viewing condition in which there was no masking, van Diepen et al. (1995) were able to gauge the effect of the different mask-onset intervals on scene viewing performance. They found that subjects' viewing times were only significantly affected at the shortest mask-onset interval (15 ms), indicating that visual information sufficient for normal task performance could be extracted from the images in an interval as short as only 45 ms following fixation onset. In a follow-up study, van Diepen et al. (1999) asked whether in this scene search task, the extraction of scene information occurs very early during a fixation or if it could also be extracted during the later portion of the fixation. To address this, van Diepen et al. (1999) employed a paradigm where visual information in the foveated region was masked for a fixed interval (83 ms) and the onset of this masking interval occurred either early in the fixation or at progressively later intervals (either 15, 35, 60, or 85 ms after fixation onset). Consistent with the hypothesis that visual information is extracted rapidly and early within a fixation, the mask had a large effect on viewing time in the earliest mask onset condition but not in the later onset conditions. 
The findings of van Diepen et al. (1995, 1999) suggest that detailed object information can be extracted from scenes during a very brief exposure within an eye fixation. However, a more recent study produced a conflicting pattern of findings. Rayner et al. (2009) had subjects view full-color real-world scenes while searching for a particular target object (scene search) or while memorizing the scene for a later recognition memory test (scene memory). Using the mask-onset delay paradigm, the entire scene was masked a certain interval after the onset of fixations (either 25, 50, 75, 100, 150, 200, or 250 ms). In contrast to the findings of van Diepen et al., task performance in both viewing tasks was impaired compared to free viewing (i.e., no masking) for mask-onset intervals that were less than 150 ms. Rayner et al. intimated that this discrepancy might be related to differences in the scene stimuli used in the two studies. In particular, the scene information in the full-color scenes they used might be expected to take longer to encode compared to the information contained in the line-drawn black and white scenes used by van Diepen et al. However, there were also other differences between the studies. Most importantly, in van Diepen et al.'s studies, foveal vision was masked, while in Rayner et al.'s study the entire scene was masked. 
In addition to measuring overall task performance during scene viewing, both Rayner et al. and van Diepen et al. looked for evidence of changes in the patterns of eye movements as a function of masking condition. For example, it might be the case that even though scene viewing performance is unaffected in a masking condition, the eye movement record might reveal that these conditions induce changes in basic eye movement measures such as fixation duration and saccade amplitude. Indeed, van Diepen et al. found that compared to free viewing, the masking conditions produced longer fixation durations at each mask-onset interval. Similarly, Rayner et al. also found that the masking conditions produced longer fixation durations compared to a free viewing control condition (for a model of fixation durations that reproduces this pattern, see Nuthmann, Smith, Engbert, & Henderson, 2010). The effect of masking on fixation duration might reflect impaired processing of fixated material. However, both van Diepen et al. (1999) and Rayner et al. (2009) acknowledged that the lengthening of fixation durations might also be related to the sudden display change that occurs at the onset of the mask. 
Prior research has shown that a sudden display change that occurs when the eye is fixated is expected to induce an oculomotor reflex known as saccadic inhibition (Pannasch, Schulz, & Velichkovsky, 2010; Reingold & Stampe, 2000, 2002, 2004; see also Henderson & Pierce, 2008; Henderson & Smith, 2009). Due to saccadic inhibition in response to the display change, a large proportion of saccades that were scheduled to occur after the display change are likely to be delayed or cancelled. This reduction in the likelihood of fixations terminating in the time window around the saccadic inhibition effect should result in a bimodal distribution of fixation durations (see Reingold & Stampe, 2000, for a discussion of the effect of saccadic inhibition on the distribution of fixation durations). Such an effect could explain the finding of longer mean fixation durations in masking conditions than in a condition where no masking occurs. Hence, in the context of the mask-onset delay paradigm, it is of critical importance to compare eye movement measures in masking conditions with a condition that controls for the effect of a sudden display change that occurs during fixation. Such a control condition should induce saccadic inhibition while allowing scene processing to proceed, thereby exposing any changes in eye movements associated with disruption to scene processing. 
With regard to the effect of scene masking on saccade amplitude, the study by Rayner et al. (2009) and those by van Diepen et al. (1995, 1999) produced conflicting patterns of findings. Specifically, van Diepen et al. found that subjects tended to make longer saccades in the masking conditions compared to the condition in which no masking occurred. In contrast, Rayner et al. found that saccadic amplitude was reduced in the masking condition. We hypothesize that these differences are due to the areas that were masked in the two studies. In the studies by van Diepen et al., only the foveated area was masked, and they suggested that the lengthening of saccade amplitudes might reflect a tendency for viewers to try to direct saccades to areas outside of the masked area. In contrast, in the study by Rayner et al., the mask obscured the entire scene display, which (as Rayner et al. acknowledged) might have interfered with the planning of saccades to targets outside of the central visual field, resulting in shorter saccades on average (for a discussion of central and peripheral masking and saccadic planning, see van Diepen & d'Ydewalle, 2003). 
To summarize, prior research using the mask-onset delay paradigm to study the encoding of scene information has produced a conflicting pattern of results. This is likely to be related to a number of methodological differences between studies. Specifically, the prior research by van Diepen et al. and by Rayner et al. involved differences in the areas of the visual field that were masked, differences in the complexity of the scene stimuli used, and differences in the viewing tasks. In the present experiments, we sought to resolve these differences by contrasting the central and peripheral masking conditions within the same experimental paradigm. Given that both Rayner et al. and van Diepen et al. used a scene search task, in Experiment 1 we applied the mask-onset delay paradigm to a scene search task, while in Experiment 2 we employed a scene memory task similar to the one that was used by Rayner et al. (2009). 
Experiment 1
In Experiment 1, subjects viewed natural full-color scenes while searching for a target object in the scene. During viewing, either the central visual field or the entire scene was masked a certain interval after fixation onset (50 or 100 ms). In addition, we included a control for the effects of saccadic inhibition that is expected to follow from the display change at mask onset. In these control conditions, the luminance of either the central visual field or the entire scene was increased following the mask-onset interval. The display changes in these conditions were expected to induce a strong saccadic inhibition effect without interrupting the extraction of visual information from the scene, thus providing a baseline with which to compare the effect of masking on eye movement behavior. 
Methods
Subjects
Nine undergraduate students at the University of California, San Diego participated for credit as part of an introductory psychology course. All subjects had normal or corrected-to-normal vision and were naive with respect to the purpose of the experiment. 
Apparatus
Eye movements were measured with an SR Research EyeLink 1000 system with high spatial resolution and a sampling rate of 1000 Hz. Viewing was binocular, but only the right eye was monitored. Following calibration, gaze position error was less than 0.5°. The stimuli were presented on a 19-inch monitor with a refresh rate of 150 Hz and a screen resolution of 1024 × 768 pixels (38° × 28.5°). Subjects were seated 65 cm from the display and a chin rest with a head support was used to minimize head movement. Both experiments were implemented in SR Research Experiment Builder, which allowed precise timing control over the display changes. Fixation onsets were detected online by the EyeLink 1000, and after a prespecified interval (see Materials and design section for description of the display change conditions), the display was updated (mean fixation-onset to mask-onset intervals were 50 ms and 100 ms; standard deviation of mask-onset intervals = 0.5 ms).
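The timing logic of the display change can be illustrated with a minimal sketch. It is not the Experiment Builder implementation; the `Fixation` record, the function name, and the fixation times are hypothetical, and the only rules taken from the text are that the changed display appears a fixed delay after fixation onset and that the intact scene returns when the next saccade begins.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    onset_ms: float   # time at which the fixation began
    offset_ms: float  # time at which the next saccade began


def changed_display_intervals(fixations, delay_ms):
    """Return (start, end) intervals during which the changed display is shown.

    The changed display appears delay_ms after each fixation onset and the
    intact scene is restored when the next saccade starts. Fixations that end
    before the delay has elapsed never trigger a change.
    """
    intervals = []
    for fix in fixations:
        change_at = fix.onset_ms + delay_ms
        if change_at < fix.offset_ms:
            intervals.append((change_at, fix.offset_ms))
    return intervals


# Hypothetical fixation times, for illustration only.
fixations = [Fixation(0, 240), Fixation(280, 320), Fixation(360, 700)]
intervals = changed_display_intervals(fixations, delay_ms=50)
print(intervals)
```

In this toy sequence the second fixation lasts only 40 ms, so it ends before the 50-ms delay expires and the scene stays intact throughout it.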
Materials and design
Stimuli were color images drawn from the Corel Image Database. We selected 184 images such that they contained a unique object that the subject would search for (e.g., a telephone in a room). All images were 1024 × 683 pixels (38° × 25°) and were centered on the screen horizontally and vertically over a black background. 
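The reported visual angles can be checked with a simple conversion. The sketch below assumes a constant number of degrees per pixel across the screen (a linear approximation that ignores the flat-screen geometry), using the reported screen width of 1024 pixels = 38°; the function name is ours.

```python
SCREEN_WIDTH_PX = 1024    # reported horizontal resolution
SCREEN_WIDTH_DEG = 38.0   # reported horizontal extent in degrees


def pixels_to_degrees(pixels):
    """Convert a pixel extent to degrees, assuming constant degrees per pixel."""
    return pixels * SCREEN_WIDTH_DEG / SCREEN_WIDTH_PX


screen_height_deg = pixels_to_degrees(768)
image_height_deg = pixels_to_degrees(683)
print(screen_height_deg, round(image_height_deg, 1))
```

Under this approximation the 768-pixel screen height comes out at 28.5° and the 683-pixel image height at roughly 25.3°, in line with the rounded values reported above.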
For each image, two alternate versions were generated using Matlab. The “Scramble” version was generated by dividing the image into 8 × 8 pixel squares and randomly shuffling these squares (following Rayner et al., 2009). This shuffling process effectively obliterated scene information; while some simple texture information might remain, the organization of the elements of the scene was removed. In order to create conditions that would control for the effect of saccadic inhibition, we created a “Bright” version of each image by increasing the gamma value such that the image was significantly brighter. These altered versions of the images were used to create four display change conditions. In each condition, a certain interval after the onset of each eye fixation, the original scene image was replaced by one of the alternate versions. In the full screen change conditions, the original scene was replaced with either the scrambled version of the scene (“Full Screen Scramble”) or the brighter version of the scene (“Full Screen Bright”). In the central display change conditions, a circular area centered on the fixation point (diameter = 200 pixels = 7.42° of visual angle) was replaced by either the scrambled version of the scene (“Central Scramble”) or the bright version of the scene (“Central Bright”). The area outside of the masking circle was unchanged and continued to display the original scene information. The original scene image was presented when the subject initiated a saccadic eye movement (see Figure 1 for a schematic diagram of the mask-onset manipulation). Finally, in a fifth condition, subjects viewed the original scene freely and no display changes occurred (“Free Viewing”). The size of the central mask was chosen such that in the Central Scramble condition, visual information in the range of foveal vision (central few degrees) was completely abolished by the mask, while peripheral visual information remained available. 
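The image manipulations described above can be sketched as follows. The original stimuli were generated in Matlab; this NumPy version is only an illustration under several assumptions that the text does not specify: tiles that do not fit a whole 8 × 8 block (683 is not a multiple of 8) are left in place, brightening is implemented as `value ** (1 / gamma)` with a hypothetical `gamma` of 2.0, and the stand-in scene is random noise rather than a photograph.

```python
import numpy as np


def scramble_blocks(image, block=8, rng=None):
    """Shuffle non-overlapping block x block tiles of an H x W x C image."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    rows, cols = h // block, w // block
    out = image.copy()
    # Assumption: only the region covered by whole tiles is shuffled.
    core = image[:rows * block, :cols * block]
    tiles = core.reshape(rows, block, cols, block, -1).swapaxes(1, 2)
    tiles = tiles.reshape(rows * cols, block, block, -1)
    tiles = tiles[rng.permutation(rows * cols)]
    tiles = tiles.reshape(rows, cols, block, block, -1).swapaxes(1, 2)
    out[:rows * block, :cols * block] = tiles.reshape(rows * block, cols * block, -1)
    return out


def brighten(image, gamma=2.0):
    """Brighten an 8-bit image; gamma is a hypothetical value, not the paper's."""
    scaled = image.astype(np.float64) / 255.0
    return np.round(255.0 * scaled ** (1.0 / gamma)).astype(np.uint8)


def central_change(original, alternate, fix_x, fix_y, diameter=200):
    """Show `alternate` inside a circle around fixation and `original` elsewhere."""
    h, w = original.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    inside = (xx - fix_x) ** 2 + (yy - fix_y) ** 2 <= (diameter / 2) ** 2
    out = original.copy()
    out[inside] = alternate[inside]
    return out


# Stand-in "scene" of random pixels with the reported image dimensions.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(683, 1024, 3), dtype=np.uint8)
scrambled = scramble_blocks(scene, rng=rng)                      # Full Screen Scramble
bright = brighten(scene)                                         # Full Screen Bright
central = central_change(scene, scrambled, fix_x=512, fix_y=341)  # Central Scramble
```

Passing `bright` instead of `scrambled` to `central_change` would give the corresponding Central Bright display.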
Figure 1
 
Schematic diagram depicting the Full Screen Scramble and the Central Scramble conditions employed during scene viewing in Experiment 1. The red dot represents an eye fixation, and the red dot with an arrow represents a saccade. The Central Bright and Full Screen Bright conditions were analogous, but a version of the scene with increased gamma was used instead of the scrambled version. The display change manipulations were identical in Experiment 2, though three mask-onset delays were used (50 ms, 75 ms, and 100 ms).
Within each of the four display change conditions, we manipulated the interval following fixation onset at which the change occurred (50 ms or 100 ms). The display change condition and masking onset interval remained constant within each trial. Crossing display change condition and mask-onset intervals yielded 9 different conditions (four display change conditions × two mask onset intervals + the Free Viewing condition). The order of conditions was randomized across trials, and the assignment of stimuli to conditions was counterbalanced across subjects in a Latin square design. 
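The design can be enumerated in a few lines. The sketch below builds the nine conditions and one possible Latin square; the cyclic construction is an assumption, since the text states only that a Latin square was used, and the interpretation of rows as counterbalancing lists and columns as stimulus sets is likewise hypothetical.

```python
from itertools import product

DISPLAY_CHANGES = [
    "Full Screen Scramble",
    "Full Screen Bright",
    "Central Scramble",
    "Central Bright",
]
MASK_ONSETS_MS = [50, 100]

# Four display change conditions x two mask-onset intervals, plus Free Viewing.
conditions = list(product(DISPLAY_CHANGES, MASK_ONSETS_MS))
conditions.append(("Free Viewing", None))


def latin_square(n):
    """Cyclic n x n Latin square: the cell in row r, column c holds (r + c) % n."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


square = latin_square(len(conditions))
# Assumed reading: row = counterbalancing list, column = stimulus set.
assignment = [[conditions[i] for i in row] for row in square]
print(len(conditions))
```

Each condition index appears exactly once in every row and every column, so across the counterbalancing lists every stimulus set is paired with every condition once.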
Procedure
A 9-point calibration procedure was performed at the beginning of each experiment, followed by a 9-point calibration accuracy test. Calibration was repeated if any point was in error by more than 1° or if the average error for all points was greater than 0.5°. Subjects were instructed that on each trial they would be presented with the name of the search target (e.g., light switch, telephone, woman, etc.) and they would have 20 s to locate that target object in the scene that followed. Having read the target name, the subject pressed a button on a response pad and the scene containing the target object was presented on the screen. Subjects were instructed that they should find the target object as quickly as possible and that when they had located the target, they should fixate their gaze on it and press a button on the response pad. If the subject had not responded within 20 s, the trial terminated automatically. The search task proceeded in 6 blocks of 30 trials, preceded by four practice trials that were disregarded for the purposes of data analysis. Between blocks, the subject was given the opportunity to take a short break and recalibration of the eye tracker was carried out if necessary. The entire procedure lasted approximately 45 min. 
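The calibration acceptance rule stated above reduces to a simple check. This is a sketch with a hypothetical function name and made-up error values; only the two thresholds (1° for any point, 0.5° for the average) come from the text.

```python
def needs_recalibration(point_errors_deg, max_point_error=1.0, max_mean_error=0.5):
    """Return True if any point error exceeds 1 deg or the mean error exceeds 0.5 deg."""
    mean_error = sum(point_errors_deg) / len(point_errors_deg)
    return max(point_errors_deg) > max_point_error or mean_error > max_mean_error


# Hypothetical errors (degrees) for the nine validation points.
good = [0.3] * 9
one_bad_point = [0.3] * 8 + [1.2]
print(needs_recalibration(good), needs_recalibration(one_bad_point))
```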
Results
The results of Experiment 1 are presented in two sections. In the first section, we contrasted task performance for each of the display change conditions with performance in the Free Viewing condition. Of specific interest was to determine whether individual display change conditions and mask-onset intervals produced impairments in scene search. In the second section, we examined the effect of the different display change conditions and mask-onset intervals on eye movements. Accordingly, we examined fixation duration and saccade amplitude, and we also conducted a fixation cluster analysis in order to reveal any differences in the spatial distribution of eye movements across conditions. 
Task performance
Task performance was examined using three behavioral variables that were derived to reflect search efficiency. For each display change condition at each mask-onset interval, and for the Free Viewing condition, we computed search accuracy as the proportion of trials in which subjects made a response (i.e., declared that they had located the search target). We also computed the mean response latency for these responses as well as the latency for subjects to first fixate the target object. Figure 2 shows the mean values for each variable by condition.
Figure 2
 
Measures of search performance in Experiment 1: (a) proportion of trials in which subjects made a response, (b) mean latency for responses, and (c) mean latency to first fixate the search target. Error bars represent the standard error of the mean.
We compared performance measures for each display change condition at each mask-onset interval against the Free Viewing condition in a paired t-test. We found that compared to the Free Viewing condition, in the Full Screen Scramble condition accuracy was significantly lower in the 50-ms mask-onset interval (t(8) = 2.78, p < 0.05), and there was a trend toward this effect in the 100-ms mask-onset interval (t(8) = 2.14, p = 0.06). Response latency was also significantly longer in the 50-ms mask-onset interval (t(8) = 3.82, p < 0.01). For the Central Scramble condition, accuracy was significantly lower in the 50-ms mask-onset interval (t(8) = 2.45, p < 0.05) and response latency was also longer in this condition (t(8) = 3.54, p < 0.01). No other comparisons for accuracy and response latency were significant (all ts < 1.93, all ps > 0.09). Based on these two variables, it appears that in the Full Screen Scramble condition, search performance was impaired at both the 50-ms mask-onset interval and the 100-ms mask-onset interval, and in the Central Scramble condition, search performance was impaired at the 50-ms mask-onset interval. However, when considering the latency to first fixate the target object (Figure 2c), we found that subjects were not slower to arrive at the target in the Central Scramble condition compared to the Free Viewing condition (both ts < 1.71, ps > 0.12). In contrast, in the Full Screen Scramble condition, subjects were slower to fixate the target object in both the 50-ms mask-onset interval (t(8) = 5.64, p < 0.001) and in the 100-ms mask-onset interval (t(8) = 2.91, p < 0.05). This suggests that in the Central Scramble condition, the reduction in performance at the 50-ms mask-onset interval was related to difficulty in positively identifying the target object but not necessarily difficulty in locating it within the scene, while in the Full Screen Scramble conditions both aspects of the search were impaired. 
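For readers who want to see where the t(8) values come from, the paired comparison can be sketched as below. With nine subjects, each test has 9 − 1 = 8 degrees of freedom. The per-subject values are invented for illustration and are not data from the experiment; the function name is ours.

```python
import math
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-subject response latencies (s), NOT the study's data.
masked = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7, 6.2, 5.0, 5.4]
free_viewing = [4.2, 4.5, 5.1, 4.9, 4.4, 5.0, 5.3, 4.6, 4.7]
t_value, df = paired_t(masked, free_viewing)
print(round(t_value, 2), df)
```

The statistic is the mean of the within-subject differences divided by its standard error, which is why each subject must contribute one value per condition.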
Eye movement behavior
In our analyses of eye movements, we considered all fixations and saccades that occurred after the onset of the search display and that ended prior to either the response or the end of the 20-s search period. We computed the mean fixation duration and mean saccade amplitude for each display change condition and each mask-onset interval and for the Free Viewing condition. Each of the display change conditions was tested against the Free Viewing condition via a paired t-test, and we also conducted a repeated measures ANOVA with the specific aim of testing for any systematic effect of mask-onset interval on the eye movement measures. This resulted in a 4 × 2 ANOVA crossing Display Change Condition and Mask Onset (50 ms, 100 ms). We also plotted the distribution of fixation durations (bin size = 50 ms) and saccade amplitudes (bin size = 1 degree) in each condition separately for each mask-onset interval. Finally, in order to characterize the spatial distribution of eye movements as a function of Display Change Condition and Mask Onset in each viewing task, we conducted a cluster analysis of the sequence of fixations in each trial. This analysis was designed to detect changes in the spatial distribution of eye movements that might indicate a tendency for subjects to adopt a viewing strategy that mitigates the restriction imposed by the mask (e.g., directing multiple fixations to a particular area). The method of cluster analysis was adapted from work by Nodine, Kundel, Toto, and Krupinski (1992). In this analysis, the sequence of fixations in each trial was converted into a sequence of clusters according to the following algorithm: Starting with the second fixation in the trial, each fixation was considered as the first fixation of a new cluster if the location of that fixation exceeded 2.5 degrees of visual angle from the location of the prior cluster; otherwise, it was considered a component of the prior cluster.
Cluster location was defined as the mean xy coordinates of all the component fixations of that cluster. Having derived the sequence of clusters for a given trial, we computed the number of clusters, the mean cluster duration, as well as the likelihood of a cluster being an immediate “revisit” of a prior cluster location. A cluster was considered a revisit cluster if its location fell within 2.5 degrees of visual angle of the cluster two prior (i.e., n − 2). This measure was included in order to capture any tendency for subjects to alleviate the restriction of the mask by revisiting areas that had already been fixated. These variables were subjected to the same analyses as were fixation duration and saccade amplitude. The analyses of fixation duration, saccade amplitude, and fixation clustering are described in turn. 
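The clustering procedure can be sketched as follows. Several details are assumptions, because the text leaves them open: the "location of the prior cluster" is taken to be the running mean of the fixations assigned to it so far, cluster duration is taken to be the sum of its fixation durations, "within 2.5 degrees" is treated as inclusive, and the revisit likelihood is computed over all clusters in the trial. The function names and example fixations are hypothetical.

```python
import math


def cluster_location(cluster):
    """Mean (x, y) of the fixations in a cluster."""
    xs = [f[0] for f in cluster]
    ys = [f[1] for f in cluster]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def cluster_fixations(fixations, threshold_deg=2.5):
    """Group temporally ordered (x_deg, y_deg, duration_ms) fixations into clusters."""
    clusters = []
    for fix in fixations:
        if clusters:
            cx, cy = cluster_location(clusters[-1])  # running mean (assumption)
            if math.hypot(fix[0] - cx, fix[1] - cy) <= threshold_deg:
                clusters[-1].append(fix)
                continue
        clusters.append([fix])  # first fixation, or one farther than the threshold
    return clusters


def summarize(clusters, threshold_deg=2.5):
    """Number of clusters, mean cluster duration, and proportion of revisits."""
    if not clusters:
        return {"n_clusters": 0, "mean_cluster_duration_ms": 0.0, "revisit_proportion": 0.0}
    locs = [cluster_location(c) for c in clusters]
    durations = [sum(f[2] for f in c) for c in clusters]
    # A revisit is a cluster lying within the threshold of the cluster two back.
    revisits = sum(
        1
        for i in range(2, len(locs))
        if math.hypot(locs[i][0] - locs[i - 2][0], locs[i][1] - locs[i - 2][1]) <= threshold_deg
    )
    return {
        "n_clusters": len(clusters),
        "mean_cluster_duration_ms": sum(durations) / len(durations),
        "revisit_proportion": revisits / len(clusters),
    }


# Hypothetical fixations: (x_deg, y_deg, duration_ms).
fixations = [(10, 10, 200), (11, 10, 250), (20, 10, 300), (10.5, 10, 180)]
summary = summarize(cluster_fixations(fixations))
print(summary)
```

In this toy trial the first two fixations merge into one cluster, the third starts a new cluster, and the fourth returns to the location of the first cluster, so it counts as a revisit.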
Fixation duration
As seen in Figure 3a, fixation durations were much longer in the Full Screen Scramble condition compared to the other display change conditions and the Free Viewing condition. It is also clear that fixation durations in the other display change conditions (Central Scramble, Central Bright, Full Screen Bright) were quite similar to the Free Viewing condition, though they were slightly but significantly longer (all ts > 3.17, all ps < 0.05, except for the Full Screen Bright condition at the 100-ms mask onset, t(8) = 1.35, p = 0.21). Also apparent in Figure 3a is a strong interaction between Display Change condition and Mask Onset, where fixation durations in the Full Screen Scramble condition differed as a function of Mask Onset, while this factor had virtually no effect in the other Display Change conditions. This was confirmed by a highly significant two-way interaction (F(3,24) = 26.80, MSE = 3.97 × 10⁴, p < 0.001) between Display Change condition and Mask Onset. To further examine this interaction, we conducted paired t-tests between the mask-onset intervals for each display change condition, and confirming what is clearly visible in Figure 3a, we found that fixation durations were longer for the 50-ms mask onset than the 100-ms mask onset for the Full Screen Scramble condition (t(8) = 9.30, p < 0.001), but the difference between mask-onset intervals did not approach significance for any of the other masking conditions (all ts < 1).
Figure 3
 
(a) Mean fixation duration and (b) mean saccade amplitude during scene search in Experiment 1. Error bars represent the standard error of the mean. Distributions of fixation duration as a function of display change condition for the (c) 50-ms and (e) 100-ms mask-onset intervals. Distributions of saccade amplitudes as a function of display change condition for the (d) 50-ms and (f) 100-ms mask-onset intervals.
To further investigate the effect of mask-onset interval on mean fixation duration, we examined the distributions of fixation duration at each mask-onset interval. Inspection of these distributions (Figures 3c, 3e) revealed that compared to the Free Viewing condition, the display change conditions exhibited a pronounced reduction in the proportion of fixations around the center of the distribution, resulting in bimodality. This is likely to reflect a saccadic inhibition effect associated with mask onset that begins ∼100 ms following the onset of the display change. Interestingly, for both mask-onset intervals, the Central Scramble, Central Bright, and Full Screen Bright conditions have largely overlapping distributions, while the Full Screen Scramble condition appears to differ. Specifically, the Full Screen Scramble condition has a greater proportion of fixation durations in the part of the distribution following the saccadic inhibition effect. This might reflect either a stronger saccadic inhibition effect in this condition or a rightward shift of the underlying distribution of fixation durations compared to the other masking conditions. Further research is required to explore this, but it appears that this shift in the distribution of fixation durations for the Full Screen Scramble condition gives rise to the dramatic lengthening of mean fixation durations in that condition compared to the other display change conditions. In addition, by contrasting the distributions across mask-onset intervals, it is clear that the saccadic inhibition effect impacts upon a different (later) point along the distribution as mask-onset interval increases. Accordingly, it can be inferred that a different proportion of the underlying fixation durations were affected at each mask-onset interval, which complicates the interpretation of any differences in mean fixation duration as a function of mask-onset interval.
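A distribution of this kind can be tabulated with a short sketch. It uses the 50-ms bin size from the analysis description and the approximate 100-ms lag of saccadic inhibition after the display change; the function names and the fixation durations are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def duration_histogram(durations_ms, bin_ms=50):
    """Proportion of fixations per bin, keyed by the lower edge of each bin (ms)."""
    counts = Counter(int(d // bin_ms) * bin_ms for d in durations_ms)
    total = len(durations_ms)
    return {start: counts[start] / total for start in sorted(counts)}


def expected_dip_onset(mask_onset_ms, inhibition_lag_ms=100):
    """Approximate fixation duration at which the saccadic inhibition dip begins."""
    return mask_onset_ms + inhibition_lag_ms


# Hypothetical fixation durations (ms), for illustration only.
durations = [120, 140, 160, 310, 330, 340]
hist = duration_histogram(durations)
print(hist, expected_dip_onset(50), expected_dip_onset(100))
```

Because the dip is time-locked to the display change, it is expected to start near 150 ms for the 50-ms mask onset and near 200 ms for the 100-ms mask onset, which is why the two intervals affect different parts of the distribution.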
Saccade amplitude
As seen in Figure 3b, saccade amplitude in the Central Scramble condition was greater than in the Free Viewing condition at both mask-onset intervals (both ts > 4.02, both ps < 0.01). These findings are consistent with the suggestion put forward by van Diepen et al. (1995, 1999) that in the presence of a foveal mask, viewers produce larger saccades, possibly in order to direct the next fixation outside of the masked area. Saccade amplitudes were significantly shorter in the Full Screen Scramble condition than in the Free Viewing condition at both mask-onset intervals (both ts > 2.73, both ps < 0.05). In contrast, saccade amplitude in the Central Bright and Full Screen Bright conditions did not differ from the Free Viewing condition at either mask-onset interval (all ts < 1.55, all ps > 0.16). The 4 × 2 ANOVA crossing Display Change condition and Mask Onset revealed significant main effects of Display Change condition (F(3,24) = 33.44, MSE = 0.36, p < 0.001) and Mask Onset (F(1,8) = 11.90, MSE = 0.052, p < 0.01). To further explore the effect of Mask Onset, we conducted t-tests that compared mask-onset intervals for each display change condition. This revealed that saccade amplitude was significantly longer at the 50-ms mask onset than at the 100-ms mask onset for the Central Scramble condition (t(8) = 3.77, p < 0.01) but that the mask-onset intervals did not differ for any other display change condition (all ps > 0.13). Examination of the distributions of saccade amplitude (Figures 3d and 3f) reveals that short saccades (1–2 degrees) were more prevalent in the Full Screen Scramble condition than in the Free Viewing condition and were less prevalent in the Central Scramble condition than in the Free Viewing condition. Finally, the two Bright conditions produced distributions of saccade amplitudes that were very similar to the Free Viewing condition. 
In summary, the display change conditions that involved scrambling of scene information had strong effects on saccade amplitudes that might reflect shifts in subjects' viewing strategies in response to the presence of the mask. 
Cluster analysis
In order to examine the spatial distribution of eye movements as a function of display change condition, we conducted a cluster analysis. For each trial, we computed the number of clusters, mean cluster duration, and proportion of clusters that were immediate revisits (Figures 4a, 4b, and 4c, respectively). As seen in Figure 4, fixation clustering changed markedly as a function of display change condition. In particular, there was an increase in the number of clusters at the 50-ms mask onset in both the Central Scramble condition (t(8) = 4.24, p < 0.01) and Full Screen Scramble condition (t(8) = 2.02, p = 0.07) relative to the Free Viewing condition (for all other paired tests, t < 1.72, p > 0.12). This finding parallels the increase in search time that was observed in these two conditions. Cluster duration increased significantly in the Full Screen Scramble condition (both ts > 3.97, both ps < 0.01), but for the other Display Change conditions, cluster duration was quite similar to the Free Viewing condition, which is consistent with the large increase in fixation duration in the Full Screen Scramble condition. Finally, the likelihood of cluster revisits in the Central Scramble condition was increased at both mask-onset intervals (both ts > 2.33, ps < 0.05), while in the other conditions it did not differ significantly from Free Viewing (all ts < 1.94, ps > 0.08). This might reflect a tendency for subjects to mediate the restriction imposed by the central mask by revisiting previously viewed locations. Specifically, in the Central Scramble condition, subjects were not impaired in first fixating the search target, but they were slower to make their final response, a difference that might be accounted for by the increase in immediate revisit clusters. 
Figure 4
 
Fixation cluster analysis for Experiment 1. We computed (a) the number of clusters, (b) the mean cluster duration, and (c) the proportion of clusters that were an immediate revisit to a previous cluster location as a function of display change condition and mask-onset interval. Error bars represent the standard error of the mean.
Discussion
Experiment 1 confirmed that for the mask-onset delay paradigm during scene viewing, the area of the scene that is masked is a critical variable influencing the effect of masking on performance and eye movements. In particular, consistent with the findings of Rayner et al. (2009), we found that when the entire scene is masked, search performance was impaired even at a relatively long mask-onset interval (100 ms). This deficit was accompanied by a large increase in fixation duration (over and above the increase associated with saccadic inhibition) and a reduction in saccade amplitude. In contrast, when central vision was masked but peripheral information was spared, search was only impaired at the short mask-onset interval (50 ms). Note that van Diepen et al. (1995, 1999) found that task performance was unimpaired when a foveal mask appeared 45 ms after fixation onset; however, their use of line-drawn scenes might account for this difference. Importantly, the impairment that we observed in the Central Scramble condition appeared to be limited to certain aspects of search performance. Specifically, and consistent with the findings of van Diepen et al., the masking of central vision 50 ms after fixation onset did not produce difficulties in locating and fixating the target object; however, it appeared that the process of identifying the target and responding was delayed in this condition. We also found differences in the pattern of eye movements in the central masking condition compared to the full screen masking condition. For one, unlike the full screen masking condition, in the central masking condition there was no evidence that fixation duration was affected over and above the changes that are induced by the display change associated with mask onset. Second, subjects tended to make longer saccades and more frequent revisits to previously viewed locations in the central masking condition. 
This is likely due to difficulty with target identification in that condition. 
Experiment 2
In Experiment 2, we investigated the role of viewing task on the pattern of performance and eye movements in this paradigm. Accordingly, we applied a scene memory task that was used by Rayner et al. (2009), in which subjects viewed the scenes in preparation for a recognition memory test. Viewing was restricted according to the mask-onset delay manipulations employed in Experiment 1, though we included an additional intermediate mask-onset interval (75 ms) in order to help describe any systematic effect of mask-onset interval on both performance and eye movements. 
Methods
Subjects
Thirteen undergraduate students at the University of California, San Diego participated for credit as part of an introductory psychology course. None of the subjects in Experiment 1 took part in Experiment 2. All subjects had normal or corrected-to-normal vision and were naive with respect to the purpose of the experiment. 
Apparatus
The physical setup for Experiment 2 was identical to that of Experiment 1.
Materials and design
Stimuli were color images drawn from the Corel Image Database. The images used in Experiment 2 were 520 outdoor landscape scenes. This set was relatively homogenous in terms of content, and scenes were selected so as not to contain salient details (e.g., a signpost with writing on it) that might result in ceiling performance on the recognition test that followed the viewing of the scenes. 
For each image, the “Scramble” version and “Bright” versions were generated in Matlab according to the method used in Experiment 1. There were three mask-onset intervals (50 ms, 75 ms, and 100 ms), and the display change condition and masking onset interval remained constant within each trial. Crossing display change condition and mask-onset intervals yielded 13 different conditions (four display change conditions × three mask onset delays + the Free Viewing condition). The order of conditions was randomized across trials, and the assignment of stimuli to conditions was counterbalanced across subjects in a Latin square design. 
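The design described above can be sketched in a few lines. This is only an illustration under stated assumptions: the rotation rule in `assign_conditions` is a hypothetical stand-in for the Latin square (the source does not give the exact scheme), and the figure of 260 encoding images comes from the five blocks of 52 images described in the Procedure.

```python
from itertools import product

DISPLAY_CHANGES = ["Central Scramble", "Central Bright",
                   "Full Screen Scramble", "Full Screen Bright"]
MASK_ONSETS_MS = [50, 75, 100]


def build_conditions():
    """Return the 13 viewing conditions: 4 display changes x 3 onsets + Free Viewing."""
    conditions = [("Free Viewing", None)]
    conditions += list(product(DISPLAY_CHANGES, MASK_ONSETS_MS))
    return conditions


def assign_conditions(n_images, subject_index, conditions):
    """Hypothetical rotation: image i gets condition (i + subject_index) mod k.

    Assumption for illustration only; the actual counterbalancing
    scheme is not specified in the text.
    """
    k = len(conditions)
    return [conditions[(i + subject_index) % k] for i in range(n_images)]


conditions = build_conditions()
# 260 encoding images spread over 13 conditions gives 20 images per condition.
assignment = assign_conditions(260, subject_index=0, conditions=conditions)
```

Under this rotation, each image passes through every condition once across 13 subject orders, which happens to match the 13 subjects tested, although the source does not confirm that this was the scheme used.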
Procedure
A 9-point calibration procedure was performed at the beginning of each experiment, followed by a 9-point calibration accuracy test. Calibration was repeated if any point was in error by more than 1° or if the average error for all points was greater than 0.5°. Subjects carried out a scene memory task that proceeded in two phases: an encoding phase in which they viewed a large set of scene images, followed by a recognition phase where they made two-alternative forced-choice recognition decisions. Subjects were instructed that during the encoding phase they would view a sequence of images, each for 6 s, and that they were to examine each image in preparation for a recognition memory test that would take place at the end of the sequence. They were also told that certain changes would be made to the display while they viewed the images, but that regardless of these changes they should do their best to remember the images. The subjects viewed four practice scenes in order to familiarize them with each of the display change conditions; these trials were disregarded for the purpose of data analysis. The encoding phase proceeded in five blocks of 52 images, and after viewing each image for 6 s, the subject pressed a button on a response pad to proceed to the next image. Between blocks, subjects were given the opportunity to take a short break and recalibration was carried out if necessary. During the recognition phase, they were presented with two images and they had to decide which of the two images had been presented in the prior encoding phase. One image was presented in the upper half of the screen and the other image was presented on the bottom half of the screen (images were resized to 575 × 383 pixels in order to fit together on screen). 
The “old” image from the encoding phase appeared in the top location on half of the recognition trials, as did the “new” image, which was drawn from the set of 260 scene images that did not appear in the encoding phase (half of the original set of 520 images). Subjects indicated their response by pressing either the upper or lower button on a response pad, and they received auditory feedback as to whether or not their response was correct. The 260 recognition trials were carried out in five blocks of 52 trials. The entire procedure lasted about 1 h. 
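The calibration acceptance rule stated at the start of the Procedure (repeat if any point is off by more than 1° or if the mean error exceeds 0.5°) can be written as a simple check. The function name and argument layout are assumptions made for illustration; only the two thresholds come from the text.

```python
def calibration_ok(errors_deg, max_point_error=1.0, max_mean_error=0.5):
    """Return True when a calibration test passes both criteria.

    errors_deg: per-point error in degrees of visual angle (9 points in the
    procedure described here). Thresholds default to the values in the text.
    """
    if not errors_deg:
        raise ValueError("at least one calibration point is required")
    worst = max(errors_deg)
    average = sum(errors_deg) / len(errors_deg)
    # Calibration is repeated if EITHER criterion is violated.
    return worst <= max_point_error and average <= max_mean_error
```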
Results
As in Experiment 1, we examined the effect of the different masking conditions and mask-onset intervals on task performance and eye movement measures. Accordingly, we compared recognition memory performance for scenes viewed in each of the display change conditions with those viewed in the Free Viewing condition. We also examined fixation duration and saccade amplitude, and we conducted a fixation cluster analysis in order to reveal any differences in the spatial distribution of eye movements across conditions. 
Task performance
Scene memory performance was examined by deriving mean accuracy during the recognition phase, for each of the display change conditions and each of the mask-onset intervals, as well as for the Free Viewing condition. Each display change condition was compared with the Free Viewing condition in a paired t-test. As seen in Figure 5, recognition accuracy was only reduced in the Full Screen Scramble condition for the 50-ms mask onset (t(12) = 4.29, p < 0.01) and for the 75-ms mask onset (t(12) = 2.41, p < 0.05). No other masking condition differed significantly from Free Viewing (all ts < 1.66, all ps > 0.12). Given that in Experiment 1, search performance in the Central Scramble condition at the 50-ms mask-onset interval was impaired, it was somewhat surprising to find that scene memory was unaffected in this condition. To qualify this null finding, we ran a separate control experiment to rule out the possibility that the information in central vision might not be required for normal recognition memory performance in this task. Accordingly, 12 subjects participated in a version of the experiment with two conditions: Free Viewing and Central Scramble with “zero delay,” in which the central mask was continuously centered at the point of gaze. This revealed a significant decrement in recognition accuracy in the Central Scramble Zero Delay condition (Free Viewing = 0.71, Central Scramble Zero Delay = 0.64, t(11) = 3.89, p < 0.01), confirming that the visual information obtained from central vision was indeed contributing to memory performance in the central masking conditions. Taken together with the finding that performance was not impaired in the Central Scramble 50-ms mask-onset interval condition, we can infer that the information in central vision that is relevant to recognition memory performance could be extracted within 50 ms of exposure within an eye fixation. 
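For readers who wish to see how a comparison of this kind is computed, the following is a minimal sketch of a paired t statistic on per-subject scores, using only the Python standard library. It is not the authors' analysis code, and the values in the example call are made up for illustration rather than taken from the study. The p value is not computed here, since that requires the t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(condition, baseline):
    """Return (t, df) for a paired t-test on per-subject scores.

    condition and baseline hold one score per subject, in the same order.
    """
    if len(condition) != len(baseline):
        raise ValueError("both lists need one score per subject")
    diffs = [c - b for c, b in zip(condition, baseline)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two subjects are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1


# Made-up accuracies for four hypothetical subjects (not data from the study).
t_value, df = paired_t([0.60, 0.66, 0.61, 0.65], [0.70, 0.72, 0.68, 0.75])
```

With 13 subjects, as in Experiment 2, the degrees of freedom are 12, which matches the t(12) values reported above.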
Figure 5
 
Mean recognition accuracy as a function of display change condition and mask-onset interval for Experiment 2. Error bars represent the standard error of the mean.
Eye movement behavior
We only considered fixations and saccades during the encoding phase that began after the onset of the scene and that ended within the 6-s viewing period for each scene. We computed the mean fixation duration and mean saccade amplitude for each display change condition and each mask-onset interval and for the Free Viewing condition. Each of the display change conditions was tested against the Free Viewing condition via a paired t-test, and we conducted a 4 × 3 ANOVA crossing Display Change Condition (Central Scramble, Central Bright, Full Screen Scramble, Full Screen Bright) and Mask Onset (50 ms, 75 ms, 100 ms) in order to test for any systematic effect of mask-onset interval on the eye movement measures. We also plotted the distribution of fixation durations (bin size = 50 ms) and saccade amplitudes (bin size = 1 degree) in each condition separately for each mask-onset interval. As in Experiment 1, we also conducted a cluster analysis of the sequence of fixations in each trial (for the cluster analysis method, see the Results section of Experiment 1). The analyses of fixation duration, saccade amplitude, and fixation clustering for each experiment are described in turn. 
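The distributions mentioned above (50-ms bins for fixation duration, 1-degree bins for saccade amplitude) amount to binned proportions. A minimal sketch follows; it assumes left-closed bins labeled by their lower edge, which the source does not specify.

```python
from collections import Counter


def binned_proportions(values, bin_size):
    """Map each bin's lower edge to the proportion of values in that bin.

    Bins are [edge, edge + bin_size); this convention is an assumption.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    if not values:
        return {}
    counts = Counter(int(v // bin_size) * bin_size for v in values)
    total = len(values)
    return {edge: counts[edge] / total for edge in sorted(counts)}
```

For example, `binned_proportions(durations_ms, 50)` would give the fixation duration distribution for one condition, and `binned_proportions(amplitudes_deg, 1)` the corresponding saccade amplitude distribution, where both list names are placeholders.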
Fixation duration
Mean fixation durations for each condition are plotted in Figure 6a. Examination of the means reveals that fixation durations were significantly longer in the display change conditions than the Free Viewing condition at all mask-onset intervals (all ts > 2.23, all ps < 0.05) with the exception of the Central Scramble condition at the 50-ms mask onset, in which fixation durations were numerically longer than the Free Viewing condition, though this effect was not statistically significant (t(12) = 1.71, p = 0.11). The 4 × 3 ANOVA crossing Display Change Condition and Mask Onset revealed an effect of Display Change Condition (F(3,36) = 15.72, MSE = 1.20 × 10⁴, p < 0.01), which was predominantly driven by the large increase in fixation duration in the Full Screen Scramble condition. In addition, there was a significant effect of Mask Onset (F(2,24) = 5.15, MSE = 1.93 × 10³, p < 0.05), though planned comparisons revealed that fixation durations differed significantly as a function of mask-onset interval only in the Central Scramble condition (50 ms < 100 ms, t(12) = 2.94, p < 0.05; 75 ms < 100 ms, t(12) = 3.00, p < 0.05) and in the Central Bright condition (50 ms < 75 ms, t(12) = 2.44, p < 0.05). Hence, in contrast to the findings of Rayner et al. (2009) where fixation durations tended to decrease with mask-onset interval, we observed rather weak effects of this factor that tended to run in the opposite direction. Inspection of the distributions of fixation durations (Figures 6c, 6e, and 6g) reveals the presence of a strong saccadic inhibition effect time-locked to the display change. To reiterate the caveat that was mentioned in the context of the results of Experiment 1, in this paradigm the interpretation of differences in mean fixation durations across mask-onset intervals is complicated by the saccadic inhibition effect. 
Nevertheless, by comparing against a control for the effect of saccadic inhibition (the Bright display change conditions), we replicated the qualitative pattern observed in Experiment 1 where the Full Screen Scramble condition produced distributions of fixation durations that were biased toward longer fixation durations. The reason for this change in the distribution of fixation durations in the Full Screen Scramble condition is not immediately clear, but it might be related to the impairment in performance in that condition and might reflect a shift in the underlying distribution of fixation durations or even an interaction with the saccadic inhibition effect. 
Figure 6
 
(a) Mean fixation duration and (b) mean saccade amplitude during the encoding phase for Experiment 2. Error bars represent the standard error of the mean. Distributions of fixation duration as a function of display change condition for the (c) 50-ms, (e) 75-ms, and (g) 100-ms mask-onset intervals. Distributions of saccade amplitudes as a function of display change condition for the (d) 50-ms, (f) 75-ms, and (h) 100-ms mask-onset intervals.
Saccade amplitude
The qualitative effect of the masking manipulations on saccade amplitude was quite similar to that observed in Experiment 1. As seen in Figure 6b, saccade amplitude in the Central Scramble condition was greater than in the Free Viewing condition at each mask-onset interval (all ts > 9.90, all ps < 0.001). However, saccade amplitudes in the Central Bright condition were also slightly but significantly greater than in the Free Viewing condition (all ts > 3.21, all ps < 0.01), as was the case for the Full Screen Bright condition at the 75-ms onset (t(12) = 2.72, p < 0.05) and the 100-ms onset (t(12) = 2.72, p < 0.05), suggesting that the saccade lengthening is not restricted to the case of central masking. Finally, while saccade amplitudes in the Full Screen Scramble condition were numerically smaller than in the Free Viewing condition, these differences did not reach statistical significance (all ts < 1.89, all ps > 0.08). The 4 × 3 ANOVA crossing Display Change condition and Mask Onset revealed a main effect of Display Change condition (F(3,36) = 42.1, MSE = 2.70, p < 0.001), but neither the effect of Mask Onset nor the two-way interaction was significant. Examination of the distributions of saccade amplitude (Figures 6d, 6f, and 6h) reveals that short saccades (1–2 degrees) were more prevalent in the Full Screen Scramble condition than in the Free Viewing condition and were less prevalent in the Central Scramble condition than in the Free Viewing condition. Finally, the two Bright conditions produced distributions of saccade amplitudes that were much more similar to the Free Viewing condition. 
Cluster analysis
For each trial, we computed the number of clusters, mean cluster duration, and the proportion of clusters that were immediate revisits (Figures 7a–7c). As seen in Figure 7, the effect of masking on fixation clustering produced a somewhat different pattern than that observed in Experiment 1. In particular, in Experiment 1 the masking manipulations tended to increase the number of fixation clusters, while in Experiment 2 there was a large reduction in the number of clusters in the Full Screen Scramble conditions (all ts > 6.88, ps < 0.001) and for the Central Scramble conditions the number of clusters was not significantly affected (all ts < 1.59, ps > 0.13). However, it should be noted that differences in the number of clusters across the experiments must be interpreted with caution because the available viewing period differed between the two tasks (scenes were viewed for up to 20 s in Experiment 1 vs. 6 s in Experiment 2). Examination of mean cluster durations (Figure 7b) reveals an overall pattern that was similar to Experiment 1: Cluster duration was greatly increased in the Full Screen Scramble condition at each mask-onset interval compared to the Free Viewing condition (all ts > 3.11, ps < 0.001) and was unaffected in the Central Scramble conditions (all ts < 1.17, ps > 0.27). Finally, the proportion of clusters that were immediate revisits (Figure 7c) shows a different pattern of findings than Experiment 1. While in Experiment 1 cluster revisits were reduced in the Full Screen Scramble condition and increased in the Central Scramble condition, here there was a slight increase in the proportion of clusters that were revisits in the Full Screen Scramble condition at the 50-ms mask onset compared to Free Viewing (t(12) = 3.08, p < 0.05), and there was a non-significant reduction in cluster revisits in the Central Scramble condition. 
To test for the overall effect of mask-onset interval, we conducted 4 × 3 ANOVAs crossing Display Change condition and Mask Onset. This revealed significant two-way interactions for number of clusters (F(6,72) = 5.55, MSE = 0.44, p < 0.001) and cluster duration (F(6,72) = 3.35, MSE = 1.79 × 10³, p < 0.01). To qualify the interactions, we conducted one-way ANOVAs testing for the effect of Mask Onset within each display change condition. This revealed an effect of Mask Onset only in the Central Scramble condition for number of clusters (F(2,24) = 11.10, MSE = 0.61, p < 0.001) and cluster duration (F(2,24) = 5.07, MSE = 9.38 × 10², p < 0.05). These effects reflect a tendency for the number of clusters, and cluster duration, to more closely resemble the Free Viewing condition as the mask-onset interval increased (particularly for the 100-ms mask onset). 
Figure 7
 
Fixation cluster analysis for Experiment 2. We computed (a) the number of clusters, (b) the mean cluster duration, and (c) the proportion of clusters that were an immediate revisit to a previous cluster location as a function of display change condition and mask-onset interval. Error bars represent the standard error of the mean.
Discussion
The results of Experiment 2 largely confirmed the findings of Experiment 1 but also indicated that the effect of masking in the mask-onset delay paradigm depends, in part, on the viewing task. In particular, and consistent with the findings of Rayner et al. (2009), masking of the entire scene had a detrimental effect on scene memory performance even when it occurred relatively late after fixation onset (75 ms), suggesting that subjects might require up to 100 ms of exposure for normal viewing in this condition. In sharp contrast, scene memory performance was unaffected when central vision was masked as early as 50 ms into each eye fixation, indicating that the information that is required from central vision in this task can be extracted sufficiently with only 50 ms of stimulus exposure within each fixation. This finding is in line with the findings of van Diepen et al. (1995, 1999) who found that performance was unaffected by a foveal mask that appeared very early within fixation. The effects of central and full screen masking on fixation duration and saccade amplitude in Experiment 2 were very similar to those observed in Experiment 1. In particular, the Full Screen Scramble manipulation had the largest effect on fixation duration, inducing a lengthening of fixation durations over and above that which might be caused by a display change alone (e.g., the Full Screen Bright condition). Subjects also tended to make shorter saccades in the Full Screen Scramble condition, suggesting that they experienced some difficulty in processing information about upcoming saccadic targets in the periphery. These effects might be related to the reduction in performance observed in this condition for both tasks. Our analysis of fixation clustering in Experiment 2 suggested that the changes in the spatial distribution of eye movements in response to the masking manipulations are at least partly task-dependent. 
For example, in Experiment 1, the Central Scramble condition had more clusters and more immediate cluster revisits than the Free Viewing condition, while the opposite pattern tended to hold in Experiment 2. It is likely that the greater requirement for processing of information in central vision in the search task in Experiment 1 (e.g., to positively identify the search target) led to the changes in fixation clustering in that task. For the memory task in Experiment 2, we suggest that scene encoding could take place “normally” even when central vision was masked 50 ms following fixation onset, and hence, fixation clustering was largely unaffected in this masking condition. 
General discussion
In the present experiments, we investigated the effects of restricting the availability of visual information within eye fixations during scene viewing in a search task and a memory task. To limit the availability of scene information within individual eye fixations, we used a mask-onset delay paradigm in which scene information in the central visual field, or over the entire display, was masked after a brief interval following the onset of each eye fixation. Our findings clarify two discrepant sets of results in the prior literature. In particular, we found that when central as well as peripheral information was masked (Full Screen Scramble conditions), both performance and eye movement behavior were affected even at relatively long mask-onset intervals, consistent with the findings of Rayner et al. (2009). In contrast, and consistent with the general pattern of findings reported by van Diepen et al. (1995, 1999), we found that the masking of central vision produces subtle impairments in performance that appear at earlier mask-onset intervals and depend on the specific requirements of the viewing task. Accordingly, our findings suggest that peripheral information is particularly important for normal scene viewing and that restricting the availability of this information via masking can produce impairments in scene viewing across different viewing tasks. 
In addition to the effects of masking on task performance, the present experiments provide insight regarding the effect of masking in the mask-onset delay paradigm on patterns of eye movement behavior. Consistent with prior studies employing this paradigm, in the present study, we observed that the masking manipulations tended to cause an overall increase in fixation duration. By examining the distributions of fixation durations, we found that this was related to a strong saccadic inhibition effect (Pannasch et al., 2010; Reingold & Stampe, 2000, 2002, 2004; see also Henderson & Pierce, 2008; Henderson & Smith, 2009) that occurred as a result of the display change. Specifically, the display change caused a dramatic reduction in the likelihood of a saccade occurring ∼100 ms after the change, resulting in an increase in the proportion of fixations that terminate later. To better understand the effect of saccadic inhibition, we compared the masking conditions with control conditions in which a display change occurred, but it did not obscure scene information (Central Bright and Full Screen Bright conditions). These control conditions exhibited a strong saccadic inhibition effect, indicating that display changes lengthen fixation durations in this paradigm even when they do not interfere with scene processing. Furthermore, we found that the distribution of fixation durations in the Full Screen Scramble condition was quite different from that of the other display change conditions. Specifically, this condition had a larger proportion of fixations in the latter portion of the distribution. The reason for this difference was not immediately clear; it might be the result of a stronger saccadic inhibition effect in the Full Screen Scramble condition, but it also might be related to processing difficulties that are present in that condition and not in the other display change conditions. 
For example, the masking of peripheral visual information in this condition might interfere with the planning of the upcoming saccade, leading to longer fixation durations (for a similar argument, see van Diepen & d'Ydewalle, 2003). Further research is required to examine the way in which peripheral masking might interrupt visual processing and saccade planning. 
The effect of saccadic inhibition also makes it hard to interpret the effect of mask-onset interval on mean fixation duration. Prior studies using the mask-onset delay paradigm in scene viewing (van Diepen et al., 1995, 1999; Rayner et al., 2009) found a relationship between fixation duration and mask-onset interval. In particular, these studies reported that fixation durations were longer for the shorter mask-onset intervals compared to the longer intervals. In the present study, we only observed this relationship in the Full Screen Scramble condition in Experiment 1 where fixation durations at the 50-ms mask-onset interval were much longer on average than at the 100-ms interval. In contrast, in Experiment 2, the only significant effects of mask-onset interval on mean fixation duration ran in the opposite direction. The reasons for these mixed results are not immediately clear. However, we speculate that the relationship between mean fixation duration and mask-onset interval might depend on the interaction between the saccadic inhibition effect caused by the display change at mask-onset and the underlying distribution of fixation durations. Specifically, the proportion of saccades (and hence fixation durations) that are affected by the inhibition depends on the underlying probability of a saccade occurring during the period after a display change where saccadic inhibition takes effect. Display changes that induce an inhibition pattern near the mode of the distribution of fixation durations will affect more fixations than those that occur during the beginning or tail of the distribution. Hence, saccadic inhibition should be expected to affect mean fixation duration differently at different mask-onset intervals, but the exact way in which it is affected is difficult to predict in advance. In light of this, we suggest that differences in mean fixation duration as a function of mask-onset interval should be interpreted with caution. 
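The argument in the preceding paragraph can be made concrete with a small sketch. Given a baseline set of fixation durations (for instance, from Free Viewing), it estimates the share of fixations that would have ended inside the inhibition window for a given mask-onset interval. The ∼100 ms latency follows the description in the text; the 50-ms window width is a hypothetical value chosen only for illustration, as are the durations in the example.

```python
def proportion_in_inhibition_window(durations_ms, mask_onset_ms,
                                    inhibition_latency_ms=100, window_ms=50):
    """Share of baseline fixations ending while saccadic inhibition is active.

    The window starts inhibition_latency_ms after the display change, which
    itself occurs mask_onset_ms after fixation onset. window_ms is a
    hypothetical width, not a value reported in the source.
    """
    if not durations_ms:
        raise ValueError("at least one fixation duration is required")
    start = mask_onset_ms + inhibition_latency_ms
    end = start + window_ms
    hits = sum(1 for d in durations_ms if start <= d < end)
    return hits / len(durations_ms)


# Made-up baseline durations (ms), not data from the study.
baseline = [140, 160, 170, 180, 210, 260, 400]
affected_50 = proportion_in_inhibition_window(baseline, mask_onset_ms=50)
affected_100 = proportion_in_inhibition_window(baseline, mask_onset_ms=100)
```

In this toy example the 50-ms onset places the window at 150–200 ms, near the bulk of the made-up distribution, whereas the 100-ms onset places it at 200–250 ms, where fewer fixations end. The two onsets therefore affect different proportions of fixations, which is why differences in mean fixation duration across mask-onset intervals are hard to interpret.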
We also observed changes in saccade amplitude as a function of display change condition that replicate prior research. Consistent with the findings of van Diepen et al. (van Diepen et al., 1995; van Diepen & d'Ydewalle, 2003; van Diepen et al., 1999), we found that when central vision was obscured by a mask, subjects tended to make longer saccades. As suggested by van Diepen et al. (1999), this might reflect a tendency to direct upcoming fixations to the area outside of the masked area in central vision. Consistent with the findings of Rayner et al. (2009), we found that when the entire scene was masked, viewers tended to make shorter saccades. By examining the distribution of saccade amplitudes, we found that the proportion of small amplitude saccades was higher in the Full Screen Scramble condition than in the other display change conditions. We speculate that the masking of visual information in the periphery interferes with the planning of upcoming saccadic targets, which results in an increase in small amplitude saccades (for a similar argument, see van Diepen & d'Ydewalle, 2003). 
Expanding on prior research using this paradigm, we applied a fixation cluster analysis. This analysis revealed strong effects of the display change manipulation on fixation clustering and these effects were somewhat different across the viewing tasks. For the scene memory task in Experiment 2, significant changes in fixation clustering compared to Free Viewing were limited to the Full Screen Scramble condition, where we observed a reduction in the number of fixation clusters, increased cluster duration, and an increase in the tendency to revisit previous cluster locations compared to the other conditions. For the scene search task in Experiment 1, changes in fixation clustering were also apparent in the Central Scramble condition. In particular, there was an increase in the number of fixation clusters at the early (50 ms) mask-onset interval, and there was also an increase in the likelihood of revisit clusters compared to Free Viewing. These differences seem to parallel the differences we observed between scene memorization and scene search in the effect of masking on task performance. For example, while the masking of central vision at the 50-ms mask-onset interval had no effect on task performance in the scene memory task, it did appear to introduce some difficulty in scene search. Specifically, subjects took longer to make their response and were more likely to fail to respond within the search period, suggesting that masking of visual information in the central visual field 50 ms after fixation onset resulted in reduced search efficiency. 
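The three cluster measures discussed above (number of clusters, mean cluster duration, and proportion of immediate revisits) can be computed from a sequence of fixations once a clustering rule is fixed. The sketch below is one simple possibility and is not claimed to be the procedure used in the present analysis. It assumes that temporally consecutive fixations within a fixed radius of the running cluster centroid form a cluster, and that an "immediate revisit" is a cluster located within that radius of the cluster visited two steps earlier. The 2-degree radius and the function names are hypothetical.

```python
from math import hypot


def cluster_fixations(fixations, radius=2.0):
    """Group temporally consecutive fixations into spatial clusters.

    fixations: list of (x, y, duration_ms) in viewing order, x and y in degrees.
    A fixation joins the current cluster if it lies within `radius` of the
    cluster's running centroid; otherwise it starts a new cluster.
    """
    clusters = []
    for x, y, dur in fixations:
        if clusters:
            c = clusters[-1]
            if hypot(x - c["x"], y - c["y"]) <= radius:
                n = c["n"] + 1
                # Incrementally update the centroid of the current cluster.
                c["x"] += (x - c["x"]) / n
                c["y"] += (y - c["y"]) / n
                c["duration"] += dur
                c["n"] = n
                continue
        clusters.append({"x": x, "y": y, "duration": dur, "n": 1})
    return clusters


def cluster_measures(clusters, radius=2.0):
    """Return the number of clusters, mean cluster duration, and revisit proportion."""
    if not clusters:
        return {"n_clusters": 0, "mean_cluster_duration": 0.0,
                "revisit_proportion": 0.0}
    # Assumed definition: cluster i is an immediate revisit if it lies within
    # `radius` of cluster i - 2 (an A -> B -> A pattern).
    revisits = sum(
        1
        for i in range(2, len(clusters))
        if hypot(clusters[i]["x"] - clusters[i - 2]["x"],
                 clusters[i]["y"] - clusters[i - 2]["y"]) <= radius
    )
    total_duration = sum(c["duration"] for c in clusters)
    return {
        "n_clusters": len(clusters),
        "mean_cluster_duration": total_duration / len(clusters),
        "revisit_proportion": revisits / len(clusters),
    }


if __name__ == "__main__":
    trial = [(0.0, 0.0, 200), (0.5, 0.0, 250), (10.0, 0.0, 300), (0.2, 0.1, 220)]
    print(cluster_measures(cluster_fixations(trial)))
```

In the example trial, the first two fixations merge into one cluster, the third forms a second cluster, and the fourth returns to the location of the first cluster, so it is counted as an immediate revisit under the assumed definition.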
Thus, the search task in Experiment 1 seemed to be more sensitive to the availability of visual information in central vision than the memory task in Experiment 2, both in terms of performance and fixation clustering. This is likely to be related to differences in the requirements of the scene memory task and the scene search task. In particular, it is known that the activation of scene gist can occur following a very brief exposure (e.g., Castelhano & Henderson, 2008). Thus, in the present study, it may be the case that in the Central Scramble condition, where visual information in central vision was present for only 50 ms but peripheral information remained visible throughout the remainder of the fixation, scene gist could be extracted sufficiently to support performance in the scene recognition task. In contrast, performance in the search task in Experiment 1 appears to rely more heavily on the availability of information in central vision. We suggest that when central vision is masked shortly after fixation onset, the processing of scene layout and gist can proceed sufficiently to provide the general location of the search target within the scene. However, the removal of scene information in central vision might interfere with the process of identifying the search target, which is likely to rely more heavily on the detailed information supplied by central vision. For example, in the Central Scramble condition at the early mask-onset interval (50 ms), subjects might have had some difficulty positively identifying the search target or matching its appearance with the cued target word, and as a result, they tended to revisit the target location more often and to take longer to make an overt response. 
The results of the present experiments bridge research in the reading domain demonstrating that word information can be encoded with as little as 50–60 ms of exposure within a fixation (Ishida & Ikeda, 1989; Liversedge et al., 2004; Rayner et al., 1981, 2003, 2006; Slowiaczek & Rayner, 1987) and the conflicting body of findings in the literature on scene perception that has produced estimates of the time required to encode scene information as long as 150 ms (Biederman, 1981; Biederman et al., 1982; Castelhano & Henderson, 2007, 2008; Fei-Fei et al., 2007; Greene & Oliva, 2009; Intraub, 1980; Joubert et al., 2007; Loschky et al., 2007; Oliva & Schyns, 1997; Potter, 1975; Rayner et al., 2009; Schyns & Oliva, 1994; Thorpe et al., 1996; van Diepen et al., 1995, 1999; Van Rullen & Thorpe, 2001; Võ & Henderson, 2010; Võ & Schneider, 2010). In particular, we found a dissociation between the Central Scramble and Full Screen Scramble conditions, and we also found that the minimum exposure duration for normal performance depends partly on the requirements of the viewing task. When the entire scene was masked, we observed strong disruptions to task performance even when the mask occurred at relatively long intervals after fixation onset (100 ms in Experiment 1; 75 ms in Experiment 2). The deficits associated with full screen masking likely reflect an interruption in the extraction of information from the peripheral visual field that is required for the planning of upcoming saccades. This information might take longer to encode, or as was suggested by van Diepen and d'Ydewalle (2003), peripheral information might be used later rather than earlier in a fixation, and hence, the masking of this information might produce impairments even when it occurs at relatively late intervals after fixation onset. 
The effect of central masking depended on the requirements of the viewing task: While memory performance (Experiment 2) was unaffected by a central mask with a mask-onset interval of 50 ms, this manipulation interfered with certain aspects of performance during visual search (Experiment 1). 
Finally, our findings indicate that the disruption to eye movement patterns in the mask-onset delay paradigm is likely mediated by multiple mechanisms. Specifically, in addition to the removal of scene information and the possible visual degradation due to masking, the visible display change inherent in the mask-onset delay paradigm produces strong saccadic inhibition. Moreover, the masking manipulations might also interact with visual attention. For example, the appearance of the mask in central vision introduces strong visual edges in the display, and the central mask might also be treated as a new object in the visual field (i.e., a visual attention distractor effect). As discussed earlier, one contribution of the present research is the introduction of control display change conditions (i.e., the Central Bright and Full Screen Bright conditions). The present findings suggest that, with respect to saccadic inhibition, such a control display change is an effective method for providing a proper baseline for the masking conditions. Furthermore, the Central Bright control condition we employed also produced a highly salient visual onset with strong visual edges, and consequently, this condition might help to control for a possible visual attention distractor effect. Thus, we would argue that in addition to reconciling conflicting findings from studies employing the mask-onset delay paradigm, the present study introduces an important extension of this paradigm in order to explore the mechanisms underlying the observed disruption to eye movements. 
Acknowledgments
We thank Randy Tran and Karolina Hryciuk for assistance in collecting subject data and Tim Smith and Mary Hayhoe for their comments on an earlier draft of the paper. This research was supported by Grant HD26765 from the US National Institutes of Health, by the Atkinson Fund, and by the Natural Sciences and Engineering Research Council of Canada. 
Commercial relationships: none. 
Corresponding author: Mackenzie Glaholt. 
Email: mackenzie.glaholt@gmail.com. 
Address: Department of Psychology, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA. 
Footnotes
1 In the present paper, we use the term “mask” as it was used in prior research on the “mask-onset delay paradigm” and in particular to describe manipulations where a segment of the scene or else the entire scene is replaced with scrambled scene information (e.g., Rayner et al., 2009) or a noise mask (e.g., van Diepen et al., 1995). With regard to the masking manipulations in the present experiments, we make no a priori assumptions about the objects, segments, or spatial regions of the scene that are being processed in a given fixation or the nature of the information, or the processing, that is disrupted by the presentation of the scrambled scene information. While in the present study scene information is removed by a scrambling of the stimulus image, there are additional possible manipulations that could be explored in future work, such as by blanking the display or by replacing the scene with another scene. Note that Rayner et al. (2006) found no difference in reading studies between conditions in which the fixated words disappeared (i.e., blanking the display) and conditions where words were replaced by a masking pattern.
References
Biederman I. (1981). On the semantics of a glance at a scene. In Kubovy M. Pomerantz J. R. (Eds.), Perceptual organization (pp. 213–253). Hillsdale, NJ: Lawrence Erlbaum Associates.
Biederman I. Mezzanotte R. J. Rabinowitz J. C. (1982). Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14, 143–177.
Castelhano M. S. Henderson J. M. (2007). Initial scene representations facilitate eye movement guidance in visual search. Journal of Experimental Psychology: Human Perception and Performance, 33, 753–763.
Castelhano M. S. Henderson J. M. (2008). The influence of color on the perception of scene gist. Journal of Experimental Psychology: Human Perception and Performance, 34, 660–675.
Fei-Fei L. Iyer A. Koch C. Perona P. (2007). What do we perceive in a glance of a real-world scene? Journal of Vision, 7(1):10, 1–29, http://www.journalofvision.org/content/7/1/10, doi:10.1167/7.1.10.
Greene M. R. Oliva A. (2009). The briefest of glances. Psychological Science, 20, 464.
Henderson J. M. Ferreira F. (2004). Scene perception for psycholinguists. In Henderson J. M. Ferreira F. (Eds.), The interface of language, vision, and action (pp. 1–58). New York: Psychology Press.
Henderson J. M. Hollingworth A. (1999). High-level scene perception. Annual Review of Psychology, 50, 243–271.
Henderson J. M. Pierce G. L. (2008). Eye movements during scene viewing: Evidence for mixed control of fixation durations. Psychonomic Bulletin & Review, 15, 566.
Henderson J. M. Smith T. J. (2009). How are eye fixation durations controlled during scene viewing? Evidence from a scene onset delay paradigm. Visual Cognition, 17, 1055–1082.
Intraub H. (1980). Presentation rate and the representation of briefly glimpsed pictures in memory. Journal of Experimental Psychology: Human Learning and Memory, 6, 1–12.
Ishida T. Ikeda M. (1989). Temporal properties of information extraction in reading studied by a text-mask replacement technique. Journal of the Optical Society of America A, 6, 1624–1632.
Joubert O. R. Rousselet G. A. Fize D. Fabre-Thorpe M. (2007). Processing scene context: Fast categorization and object interference. Vision Research, 47, 3286–3297.
Liversedge S. P. Rayner K. White S. J. Vergilino-Perez D. Findlay J. M. Kentridge R. W. (2004). Eye movements when reading disappearing text: Is there a gap effect in reading? Vision Research, 44, 1013–1024.
Loschky L. C. Sethi A. Simons D. J. Pydimarri T. N. Ochs D. Corbeille J. L. (2007). The importance of information localization in scene gist recognition. Journal of Experimental Psychology: Human Perception and Performance, 33, 1431–1450.
Nodine C. F. Kundel H. L. Toto L. C. Krupinski E. A. (1992). Recording and analyzing eye-position data using a microcomputer workstation. Behavior Research Methods, Instruments, & Computers, 24, 475–485.
Nuthmann A. Smith T. J. Engbert R. Henderson J. M. (2010). CRISP: A computational model of fixation durations in scene viewing. Psychological Review, 117, 382–405.
Oliva A. Schyns P. G. (1997). Coarse blobs or fine edges? Evidence that information diagnosticity changes the perception of complex visual stimuli. Cognitive Psychology, 34, 72–107.
Pannasch S. Schulz J. Velichkovsky B. (2010). Explaining visual fixation durations in scene perception: Are there indeed two distinct groups of fixations? [Abstract]. Journal of Vision, 10(7):138, 138a, http://www.journalofvision.org/content/10/7/138, doi:10.1167/10.7.138.
Potter M. C. (1975). Meaning in visual search. Science, 187, 965–966.
Rayner K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Rayner K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62, 1457–1506.
Rayner K. Inhoff A. W. Morrison R. E. Slowiaczek M. L. Bertera J. H. (1981). Masking of foveal and parafoveal vision during eye fixations in reading. Journal of Experimental Psychology: Human Perception and Performance, 7, 167–179.
Rayner K. Liversedge S. P. White S. J. (2006). Eye movements when reading disappearing text: The importance of the word to the right of fixation. Vision Research, 46, 310–323.
Rayner K. Liversedge S. P. White S. J. Vergilino-Perez D. (2003). Reading disappearing text. Psychological Science, 14, 385–388.
Rayner K. Pollatsek A. (1981). Eye movement control during reading: Evidence for direct control. Quarterly Journal of Experimental Psychology, 33A, 351–373.
Rayner K. Smith T. J. Malcolm G. L. Henderson J. M. (2009). Eye movements and visual encoding during scene perception. Psychological Science, 20, 6–10.
Reingold E. M. Stampe D. M. (2000). Saccadic inhibition and gaze contingent research paradigms. In Kennedy A. Radach R. Heller D. Pynte J. (Eds.), Reading as a perceptual process (pp. 119–145). Amsterdam, The Netherlands: Elsevier.
Reingold E. M. Stampe D. M. (2002). Saccadic inhibition in voluntary and reflexive saccades. Journal of Cognitive Neuroscience, 14, 371–388.
Reingold E. M. Stampe D. M. (2004). Saccadic inhibition in reading. Journal of Experimental Psychology: Human Perception and Performance, 30, 194–211.
Schyns P. G. Oliva A. (1994). From blobs to boundary edges: Evidence for time- and spatial-scale-dependent scene recognition. Psychological Science, 5, 195.
Slowiaczek M. L. Rayner K. (1987). Sequential masking during eye fixations in reading. Bulletin of the Psychonomic Society, 25, 175–178.
Thorpe S. Fize D. Marlot C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.
van Diepen P. M. J. De Graef P. d'Ydewalle G. (1995). Chronometry of foveal information extraction during scene perception. Studies in Visual Information Processing, 6, 349–362.
van Diepen P. M. J. d'Ydewalle G. (2003). Early peripheral and foveal processing in fixations during scene perception. Visual Cognition, 10, 79–100.
van Diepen P. M. J. Ruelens L. d'Ydewalle G. (1999). Brief foveal masking during scene perception. Acta Psychologica, 101, 91–103.
Van Rullen R. Thorpe S. J. (2001). The time course of visual processing: From early perception to decision-making. Journal of Cognitive Neuroscience, 13, 454–461.
Võ M. L. H. Henderson J. M. (2010). The time course of initial scene processing for eye movement guidance in natural scene search. Journal of Vision, 10(3):14, 1–13, http://www.journalofvision.org/content/10/3/14, doi:10.1167/10.3.14.
Võ M. L. H. Schneider W. X. (2010). A glimpse is not a glimpse: Differential processing of flashed scene previews leads to differential target search benefits. Visual Cognition, 18, 171–200.
Figure 1
 
Schematic diagram depicting the Full Screen Scramble and the Central Scramble conditions employed during scene viewing in Experiment 1. The red dot represents an eye fixation, and the red dot with an arrow represents a saccade. The Central Bright and Full Screen Bright conditions were analogous, but a version of the scene with increased gamma was used instead of the scrambled version. The display change manipulations were identical in Experiment 2, though three mask-onset delays were used (50 ms, 75 ms, and 100 ms).
Figure 2
 
Measures of search performance in Experiment 1: (a) proportion of trials in which subjects made a response, (b) mean latency for responses, and (c) mean latency to first fixate the search target. Error bars represent the standard error of the mean.
Figure 3
 
(a) Mean fixation duration and (b) mean saccade amplitude during scene search in Experiment 1. Error bars represent the standard error of the mean. Distributions of fixation duration as a function of display change condition for the (c) 50-ms and (e) 100-ms mask-onset intervals. Distributions of saccade amplitudes as a function of display change condition for the (d) 50-ms and (f) 100-ms mask-onset intervals.
Figure 4
 
Fixation cluster analysis for Experiment 1. We computed (a) the number of clusters, (b) the mean cluster duration, and (c) the proportion of clusters that were an immediate revisit to a previous cluster location as a function of display change condition and mask-onset interval. Error bars represent the standard error of the mean.
Figure 5
 
Mean recognition accuracy as a function of display change condition and mask-onset interval for Experiment 2. Error bars represent the standard error of the mean.
Figure 6
 
(a) Mean fixation duration and (b) mean saccade amplitude during the encoding phase for Experiment 2. Error bars represent the standard error of the mean. Distributions of fixation duration as a function of display change condition for the (c) 50-ms, (e) 75-ms, and (g) 100-ms mask-onset intervals. Distributions of saccade amplitudes as a function of display change condition for the (d) 50-ms, (f) 75-ms, and (h) 100-ms mask-onset intervals.
Figure 7
 
Fixation cluster analysis for Experiment 2. We computed (a) the number of clusters, (b) the mean cluster duration, and (c) the proportion of clusters that were an immediate revisit to a previous cluster location as a function of display change condition and mask-onset interval. Error bars represent the standard error of the mean.