Open Access
Article  |   October 2019
Been there, seen that, done that: Modification of visual exploration across repeated exposures
Author Affiliations & Notes
  • Footnotes
    *  OCLD and GK contributed equally to this work.
Journal of Vision October 2019, Vol.19, 2. doi:https://doi.org/10.1167/19.12.2
      Oryah C. Lancry-Dayan, Ganit Kupershmidt, Yoni Pertzov; Been there, seen that, done that: Modification of visual exploration across repeated exposures. Journal of Vision 2019;19(12):2. doi: https://doi.org/10.1167/19.12.2.

Abstract

The underlying factors that determine gaze position are a central topic in visual cognitive research. Traditionally, studies emphasized the interaction between the low-level properties of an image and gaze position. Later studies examined the influence of the semantic properties of an image. These studies explored gaze behavior during a single presentation, thus ignoring the impact of familiarity. Sparse evidence suggested that across repetitive exposures, gaze exploration attenuates, but the correlation between gaze position and the low-level features of the image remains stable. However, these studies neglected two fundamental issues: (a) repeated scenes are displayed later in the testing session, such that exploration attenuation could be a result of lethargy, and (b) even if these effects are related to familiarity, are they based on verbatim familiarity with the image, or on high-level familiarity with the gist of the scene? We investigated these issues by exposing participants to a sequence of images, some of them repeated across blocks. We found fewer, longer fixations as familiarity increased, along with shorter saccades and decreased gaze allocation towards semantically meaningful regions. These effects could not be ascribed to tonic fatigue, since they did not manifest for images that changed across blocks. Moreover, there was no attenuation of gaze behavior when participants observed a flipped version of the familiar images, suggesting that gist familiarity is not sufficient for eliciting these effects. These findings contribute to the literature on memory-guided gaze behavior and provide novel insights into the mechanism underlying the visual exploration of familiar environments.

Introduction
Whether it is Homo sapiens migrating to a new living area or a modern commuter arriving at a new workplace, human beings have had to learn new environments throughout their evolutionary history. The experience of a new place becoming familiar is well known to all of us: while at first we are highly attentive and thoroughly explore where we are, after a while we seem to pay less visual attention to our surroundings. However, although the impact of familiarity on visual exploration is easily identified in subjective experience, the literature on this topic has been slow to emerge. 
Numerous studies in the field of gaze exploration have examined the effects of low-level features of visual input on gaze position (e.g., Itti, 2005; Itti, Koch, & Niebur, 1998; Parkhurst & Niebur, 2004). These studies have tended to address the effect of bottom-up processes on gaze position through the concept of saliency maps (Koch & Ullman, 1987). Specifically, it is argued that gaze location is determined by a “winner takes all” process based on the local conspicuousness of different visual properties (such as luminance, orientation, and color) of the visual input. While this line of research has contributed to a better understanding of the ways in which gaze behavior is modified by image properties, it also curtailed the investigation of other important factors. For example, most of these studies intentionally used nonrecurring stimuli to control for possible memory effects. Although this design made it possible to investigate the relationship between low-level features of the image and gaze behavior, it also masked possible interactions between gaze and memory. 
However, in the last two decades, a handful of studies have started to investigate memory-guided gaze behavior. In general, these studies can be divided into two main types. The first examines basic gaze behavior by looking at the distributions of fixations and saccades independently of the content of the image at hand (i.e., content-free). The findings indicate that fewer fixations are made when participants observe familiar as compared to unfamiliar stimuli (Althoff & Cohen, 1999; Peth, Kim, & Gamer, 2013; Ryan, Hannula, & Cohen, 2007). Related research has shown that similar effects emerge over repetitive exposures, as images gradually become familiar (Bradley, Houbova, Miccoli, Costa, & Lang, 2011; Heisz & Shore, 2008; Kaspar & König, 2011a). For example, Kaspar and König (2011a) administered a free-viewing task in which participants saw the same set of images five times. Across repeated presentations of the images, the duration of fixations increased significantly, whereas the length of saccades and their frequency decreased significantly. These results are consistent with the general supposition that sampling behavior attenuates for familiar stimuli, manifested in fewer and longer fixations as well as fewer and shorter saccades. Unlike the content-free approach of the first line of studies, the second type of study has examined the relationship between gaze position and low-level properties of an image (Foulsham & Underwood, 2008; Harding & Bloj, 2010; Kaspar & König, 2011b; Underwood, Foulsham, & Humphrey, 2009). These studies integrated the classical concept of saliency maps into the developing field of memory-guided gaze behavior. For example, Kaspar and König (2011b) examined whether fixations towards salient low-level features of an image changed across five exposures to the same images. 
Their examination of 22 different low-level features showed that the correlation between low-level features and fixation selection did not change across repetitive exposures to the same visual input. Thus, increasing familiarity with the visual input did not change the impact of salient low-level features on gaze position. 
While these studies continued the traditional line of saliency maps, in a recent study Henderson and Hayes (2017) suggested revising the theoretical saliency map framework. They created meaning maps (based on the judgments of a large sample of subjects) that represented the semantic richness of regions of a scene. They reported that the saliency and meaning maps were correlated, and showed that the meaning maps alone accounted for unique variance in gaze behavior (i.e., when controlling for saliency, meaning still explained a significant amount of variance in gaze behavior, whereas saliency did not when controlling for meaning). This raises interesting questions as to the ways in which gaze toward semantically meaningful parts of an image is modified by memory. In contrast to the first cohort of studies, this line of research refers to gaze towards specific regions of the image based on their content. Thus, we consider this line of research as addressing content-based gaze behavior. To the best of our knowledge, no study has examined how repetitive exposures to the same image modify high-level, content-based gaze behavior. 
An overview of the current literature reveals that two fundamental issues related to gaze behavior under repetitive exposures remain unresolved. First, although previous studies attributed the reported effects to increasing familiarity, the repeated exposures were also tested later in the experimental session. Thus, the reported effects (e.g., decreased exploration) may have been due to increased fatigue or lethargy rather than familiarity. This alternative threatens the integrity of previous conclusions; as it relates to the overall accumulation of fatigue across the experiment, we refer to it as tonic fatigue. Second, even if the effects can be shown to be related to memory, it is not clear which aspect of familiarity elicited them. Specifically, theories differentiate between verbatim and gist representations in memory (Brainerd & Reyna, 2002; Reyna & Brainerd, 1995). A verbatim representation corresponds to an exact memory of the perceptual input, whereas gist representations capture memories of the general idea of this input. Thus, in this study we investigated whether the presumed memory-guided effects are an outcome of familiarity with the specific visual details of the image and their configuration (verbatim familiarity) or familiarity with the general gist of the image (gist familiarity). Differentiating between these two aspects of familiarity was impossible in previous studies, as they used visually identical images across all exposures. 
To better understand these memory-guided effects, the three experiments reported here aimed to (a) determine whether familiarity is a valid explanation for the previously reported effects, such that tonic fatigue could be rejected as an alternative explanation, (b) provide a comprehensive interpretation of the interplay between gaze behavior and semantically meaningful features of the image across repetitive exposures, and (c) shed light on the aspect of familiarity that elicits memory-guided gaze modulation. 
Experiment 1
Method
Participants
The sample was composed of 30 university students (7 men, 23 women) ranging in age from 20 to 30 (mean age: 22.9). All participants signed a written consent form and participated in exchange for course credit or payment. All had normal or corrected-to-normal vision. 
Stimuli
A total of 40 images were displayed. The stimuli consisted of color photographs of scenes, half of which were taken from Xu, Jiang, Wang, Kankanhalli, and Zhao (2014). Each image in the Xu et al. (2014) set was segmented into three feature levels: pixel (color, intensity, and orientation), object (i.e., size), and semantics (i.e., faces and text). The object level refers to properties of the specific object in the image (e.g., the area or convexity of the object in the image). The semantic attributes are related to the meaning of the object in the context of the image. The pixel-level feature maps were constructed by the algorithm developed by Itti et al. (1998), and the segmentation of the object level was carried out with a graph cuts algorithm (based on McGuinness & O'Connor, 2010). The segmentation of the images into the different objects adhered to several guidelines that provided a baseline for a more objective labeling process. Specifically, small or blurry objects were not segmented, and neither were objects that covered a large area (these were regarded as background). In addition, objects of the same type that were clustered in the same spatial position were grouped as one object. Paid subjects labeled the objects and semantic attributes. The semantic attributes covered four main categories: (a) directly related to humans (e.g., face, emotion, touched), (b) objects with implied motion in the image, (c) related to other (nonvisual) senses (e.g., objects with a recognized smell), and (d) designed to attract attention or for interaction with humans (i.e., text). A more detailed explanation of the segmentation procedure can be found in Xu et al. (2014). 
Apparatus
The stimuli were displayed on a 23 in. Syncmaster monitor, with a 120 Hz refresh rate, and a 1024 × 768 screen resolution. Monocular gaze position was tracked at 1000 Hz with an Eyelink 1000+ (SR Research Ltd., Mississauga, Ontario, Canada). Participants' heads were stabilized using a chinrest situated 60 cm from the screen. 
Procedure
At the beginning of the experiment, participants were told that they were going to see a set of images and that they should examine them carefully because a memory test would be conducted afterwards. The experiment was composed of four learning blocks, in which participants observed the same set of 40 images in random order in each block. Afterward, a recognition block was administered that contained the 40 original images (i.e., images that were displayed in the previous blocks) together with 40 novel images. The original and novel images were displayed in random order and participants were asked to indicate by a key press whether each image was old or new (see Figure 1). Before each block, the participants went through the standard nine-point calibration and validation procedure provided by the eye tracker's manufacturer. 
Figure 1
 
Experimental design for Experiment 1. Participants were repetitively exposed to a set of 40 images across four blocks. The order of images in each block was randomized. On the recognition block, participants saw a set of 80 images and were asked to indicate by a key press whether each image was old or new.
Data exclusion criteria
Since the main purpose of the current research was to investigate gaze behavior during the exploration of scenes, we excluded anomalous trials based on the number of fixations. We chose this measure as our exclusion criterion, because it indicated whether enough data on participants' scanning behavior were available. Accordingly, on each trial we calculated the z scores based on the number of fixations across all participants' trials and excluded trials that were 2.5 standard deviations above or below the mean. This procedure resulted in excluding 3.7% of all trials. Moreover, we also eliminated fixations that were positioned outside the screen boundaries, resulting in the removal of 1.3% of the recorded fixations. 
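The exclusion rule above amounts to a z-score filter over per-trial fixation counts. The sketch below is an illustrative reimplementation under our own naming, not the authors' analysis code:

```python
import numpy as np

def exclude_trials(fixation_counts, z_thresh=2.5):
    """Return a boolean mask keeping trials whose fixation count
    lies within z_thresh standard deviations of the mean
    (illustrative sketch; variable names are our own)."""
    counts = np.asarray(fixation_counts, dtype=float)
    z = (counts - counts.mean()) / counts.std()
    return np.abs(z) <= z_thresh

# A trial with an extreme fixation count is flagged for exclusion
mask = exclude_trials([12, 14, 13, 15, 11, 40, 13, 12, 14, 13])
```

In the experiment the z scores were computed across all participants' trials; here a single small vector stands in for that pool.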
Results
Accuracy
Participants' performance on the recognition test was at ceiling with 97.5% correct responses. Therefore, we did not analyze how memory performance was related to exploration characteristics. 
Basic gaze behavior analysis (content-free)
Based on previous studies (Bradley et al., 2011; Heisz & Shore, 2008; Kaspar & König, 2011a), we explored how the fixation rate (i.e., mean number of fixations per second), mean fixation duration, and saccade amplitude changed as familiarity increased. For each participant, we calculated the average of each of these measures per block and conducted a one-way ANOVA with block as the within-subject factor (1/2/3/4). Consistent with previous studies, we found several changes in eye movements across repetitive exposures, indicating an attenuation of gaze exploration (see Figure 2). Specifically, over blocks, the duration of fixations increased, F(3, 87) = 2.823, p = 0.043, η_p² = 0.09, whereas the fixation rate, F(3, 87) = 14.15, p < 0.001, η_p² = 0.33, and the saccadic amplitude, F(3, 87) = 7.526, p < 0.001, η_p² = 0.21, decreased. 
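The per-block analysis described above can be expressed as a one-way repeated-measures ANOVA over a participants × blocks matrix of per-block averages. Below is a minimal sketch of that computation (our own illustrative code, not the authors'; real input would be, e.g., a 30 × 4 matrix of mean fixation durations):

```python
import numpy as np

def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA for a (subjects x conditions)
    array. Returns F, effect df, error df, and partial eta squared."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_error = ((data - grand) ** 2).sum() - ss_cond - ss_subj
    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_val = (ss_cond / df_cond) / (ss_error / df_error)
    eta_p2 = ss_cond / (ss_cond + ss_error)
    return f_val, df_cond, df_error, eta_p2
```

Removing between-subject variance from the error term is what makes the design within-subject; partial eta squared is then SS_effect / (SS_effect + SS_error), matching the effect sizes reported here.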
Figure 2
 
Summary of content-free gaze behavior in Experiment 1. Modification of fixation durations (left), fixation rate (middle), and saccade amplitude (right) across repetitive exposures (blocks 1–4). Error bars indicate ±1 SE.
Feature analysis (content-based)
In order to explore how gaze toward different features of the image develops over time, we used the segmentation of images into semantic, object, and pixel levels suggested by Xu et al. (2014). As our main interest was to examine how gaze was deployed toward meaningful areas of the image, we calculated the percentage of time spent fixating on the semantic features during a trial. To determine whether gaze time on meaningful features of the image changed across exposures, we conducted a one-way ANOVA with block as the within-subject factor (1/2/3/4). This analysis yielded a significant effect of block, F(3, 87) = 11.94, p < 0.001, η_p² = 0.29, indicating a decrease in the allocation of gaze towards meaningful parts of the images as familiarity increased (see Figure 3). 
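The percentage-of-time measure can be computed by summing fixation durations that land inside the labeled semantic regions and dividing by total fixation time. A minimal sketch follows; the data layout (a boolean pixel mask per image) is our assumption for illustration, while the published segmentation itself is described in Xu et al. (2014):

```python
import numpy as np

def pct_time_on_regions(fixations, region_mask):
    """Percentage of total fixation time spent inside labeled regions.
    `fixations` holds (x, y, duration) triples; `region_mask` is a
    boolean array indexed [y, x] marking, e.g., semantic-feature
    pixels. Hypothetical data layout for illustration."""
    total = inside = 0.0
    for x, y, dur in fixations:
        total += dur
        if region_mask[int(y), int(x)]:
            inside += dur
    return 100.0 * inside / total if total else 0.0

# Toy example: top half of a 10 x 10 image is "semantic"
mask = np.zeros((10, 10), dtype=bool)
mask[:5, :] = True
pct = pct_time_on_regions([(1, 1, 100), (1, 8, 100)], mask)
```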
Figure 3
 
Summary of content-based exploration in Experiment 1. Percentage of fixation time on semantically meaningful features (left) and percentage of fixation time on salient low-level features (right) across repetitive exposures (blocks 1–4). Error bars indicate ±1 SE.
As described above, visual exploration attenuated across repetitive exposures. Specifically, we found a decrease in the fixation rate and saccade amplitude over the course of the experiment. The decrease in fixation time on the semantic features of the image could arguably be a byproduct of this decrease in fixation rate and saccade amplitude. To examine this possibility, we calculated, for each participant, the decrease in fixation rate, the decrease in saccade amplitude, and the decrease in fixation time on the semantic features as the difference between the first and last learning blocks on each measure. We then conducted two further analyses: (a) the correlation across participants between the decrease in fixation rate and the decrease in fixation time on semantic features, and (b) the correlation between participants' decrease in saccadic amplitude and their decrease in fixation time on semantic features. We expected positive correlations if the decrease in fixation rate or saccade amplitude was related to the decrease in fixation time on semantic features; thus, we computed one-tailed tests for both correlations. These analyses revealed close-to-zero correlations between the decrease in fixation rate and the decrease in fixation time on semantic features (r = −0.04, p > 0.5), as well as between the decrease in saccade amplitude and the decrease in fixation time on semantic features (r = 0.04, p = 0.42). These results suggest that the decrease in allocation of gaze towards meaningful regions of the image cannot be explained by the overall attenuation of basic gaze exploration characteristics. 
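The difference-score correlation above can be sketched in a few lines: take first-minus-last-block drops per participant on two measures, correlate them, and halve the two-tailed p-value when the observed r is in the predicted (positive) direction. This is our own illustrative code with made-up shapes, not the authors' analysis:

```python
import numpy as np
from scipy import stats

def decrease_correlation(measure_a, measure_b):
    """Correlate per-participant decreases (first minus last block)
    in two measures; return r and a one-tailed p for r > 0.
    Each input has shape (n_participants, 2): [first, last] block
    values. Hypothetical data layout for illustration."""
    drop_a = measure_a[:, 0] - measure_a[:, 1]
    drop_b = measure_b[:, 0] - measure_b[:, 1]
    r, p_two = stats.pearsonr(drop_a, drop_b)
    p_one = p_two / 2 if r > 0 else 1 - p_two / 2  # directional test
    return r, p_one

a = np.array([[3, 2], [4, 2], [5, 2], [6, 2], [7, 2]], dtype=float)
b = np.array([[4, 2], [3, 2], [6, 2], [5, 2], [7, 2]], dtype=float)
r, p_one = decrease_correlation(a, b)
```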
In addition, we examined the changes in gaze allocation towards salient features of the image by calculating the percentage of time participants fixated on salient low-level features. This analysis also yielded an effect of block, with a significant decrease in the proportion of fixation time on low-level properties, F(3, 87) = 4.475, p = 0.005, η_p² = 0.13 (see Figure 3). The correlations across individuals between their decrease in fixation time on low-level features and their overall decrease in gaze exploration were not significantly different from zero (fixation rate: r = −0.23, p > 0.5; saccade amplitude: r = −0.26, p > 0.5). 
Experiment 2a and 2b
While Experiment 1 was consistent with previous findings on content-free gaze behavior, it also shed new light on content-based gaze behavior: not only did we examine the general distribution of fixations, but we also considered the semantic value of the fixated regions. This analysis showed a decline in the time spent on meaningful and visually salient features of the image as familiarity increased. 
However, it remains unclear whether these effects occurred as a result of a strengthening representation in memory. Since gaze behavior across repetitive exposures is characterized by attenuated gaze exploration (i.e., longer fixations, a lower fixation rate, and shorter saccades), tonic fatigue is a reasonable confound that could explain this pattern of behavior. To explore this issue, in the current experiments all participants were exposed to a repeating set of images (i.e., images repeated across the four blocks) presented together with a novel set of images that changed on each block. Thus, replicating our previous findings for the repeating images but not for the changing ones would strongly support the claim that changes in gaze behavior can be ascribed to familiarity and are not a mere effect of tonic fatigue. 
Moreover, even if it can be shown that these effects are memory driven, they could be a result of familiarity with the specific configuration of the image (i.e., verbatim familiarity), or familiarity with the general gist (i.e., gist familiarity). To shed light on this question, we exposed participants to identical visual input during the first three blocks but showed them a different configuration of the same visual input (i.e., a mirror image flipped horizontally) in the last learning block. This design was aimed to disentangle the two aspects of familiarity: Although the participants were familiar with the gist of the image in this block, the configuration was novel for them. By implementing this difference we could examine whether the reported effects of gaze behavior were due to gist familiarity (i.e., if the exploration attenuation trend in the three blocks continued in the last one), verbatim familiarity (i.e., if gaze behavior in the last learning block resembled gaze behavior during the first presentation of the image) or a combination of both. Thus, Experiments 2a and 2b were designed to rule out the alternative tonic fatigue explanation for attenuated gaze exploration, and further probe which aspect of familiarity would elicit this attenuation. 
Method
Participants
The sample was composed of 32 university students (10 men, 22 women) ranging in age from 19 to 29 (mean age: 23.8) for Experiment 2a, and of 35 university students (8 men, 27 women) ranging in age from 18 to 27 (mean age: 22.5) for Experiment 2b. All participants signed a written consent form and participated in exchange for course credit or payment. All had normal or corrected-to-normal vision. 
Stimuli
The stimuli consisted of color photographs of scenes, all taken from Xu et al. (2014). A total of 120 images were displayed in this experiment: 20 images were displayed repetitively across all blocks and 80 images were changed on each block (i.e., 20 novel images for each of the four blocks). In addition, 20 new images were used for the recognition test. 
Apparatus
The setup of Experiments 2a and 2b (i.e., the room, the eye tracker, and the screen) was identical to that of Experiment 1. 
Procedure
The experimental procedure for Experiments 2a and 2b was similar to Experiment 1, with two main differences: (a) in each block, half of the images were novel and half were repeated across all blocks, and (b) in the last learning block, participants saw a flipped version of the repeating images. Accordingly, this modified paradigm was composed of four learning blocks, in which participants viewed a set of 40 images (20 repeating images and 20 changing images), presented in random order, in each block. In the fourth block, the repeating images were flipped horizontally. Importantly, the sole difference between Experiments 2a and 2b was the counterbalancing of the changing images. In Experiment 2a, the set of repeating images and the sets of changing images were fixed across all participants. In contrast, in Experiment 2b the sets were counterbalanced between participants. Hence, the repeating images of one participant were part of the changing images of another participant. Thus, any differences in gaze behavior in Experiment 2b could not be associated with the specific characteristics of the repeating or novel images. 
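One way to implement the counterbalancing described for Experiment 2b is to rotate the roles of fixed image sets across participants. The following is a hypothetical scheme in the spirit of the design (the function, set sizes, and rotation rule are our assumptions, not the authors' code):

```python
def assign_sets(participant_id, image_ids):
    """Split 120 image ids into six sets of 20 and rotate their roles
    by participant, so each set serves as repeating, changing, or
    recognition-novel for different participants (hypothetical
    counterbalancing sketch)."""
    assert len(image_ids) == 120
    sets = [image_ids[i * 20:(i + 1) * 20] for i in range(6)]
    shift = participant_id % 6
    rotated = sets[shift:] + sets[:shift]
    repeating = rotated[0]            # shown in every learning block
    changing = rotated[1:5]           # one fresh set per block
    recognition_novel = rotated[5]    # lures for the recognition test
    return repeating, changing, recognition_novel
```

Under this rotation, one participant's repeating set becomes part of another participant's changing sets, which is the property the text relies on to rule out image-specific effects.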
After the fourth learning block, a recognition block was administered consisting of the 20 repeating images together with 20 completely novel ones, displayed in random order (see Figure 4). At the end of the experiment, the participants were asked whether they noticed that the images had been flipped in the fourth learning block. All of the participants except two noticed the change. 
Figure 4
 
The experimental design in Experiments 2a and 2b. Participants were exposed to a set of 40 images in each of the four blocks. Half of these images were repetitively displayed across all blocks (turquoise frame), and half were novel and changed between blocks (orange frame). The images in each block appeared in random order, and were presented for 5,000 ms. Importantly, on the fourth block, the repeating images were flipped. On the recognition block, participants saw a set of 40 images and were asked to report by a key press whether each image was old or new.
Data exclusion criteria
Because the images in Experiment 2a were not counterbalanced, we only examined the data for the repeating images in this experiment.1 Aside from this, the exclusion criterion was similar to Experiment 1: trials that were 2.5 standard deviations above or below the average number of fixations were excluded from analysis. This procedure resulted in the exclusion of 2.7% of the trials in Experiment 2a and 2.3% of the trials in Experiment 2b. The percentage of discarded trials did not differ significantly between the repeating and changing images (Experiment 2a: t(31) = 1.209, p = 0.243; Experiment 2b: t(34) = 0.76, p = 0.455). Moreover, we excluded fixations that were positioned outside the screen boundaries, resulting in the exclusion of 0.8% of the recorded fixations in Experiment 2a and 0.6% of the recorded fixations in Experiment 2b. 
Results
Accuracy
Similar to Experiment 1, participants were highly accurate on the recognition test (Experiment 2a: 97% correct; Experiment 2b: 99.5% correct), precluding meaningful comparisons between recognized and unrecognized images. 
Basic gaze behavior analysis (content-free)
As in Experiment 1, we investigated the ways in which mean fixation duration, fixation rate, and saccade amplitude changed as familiarity increased. Accordingly, we calculated the average of these measures separately for each block and participant. Since we expected the trend of attenuated gaze exploration to change in the last learning block (i.e., the flipped version of the repeating images was expected to lead to increased exploration), the analysis was carried out in two steps. First, we conducted ANOVAs on the first three blocks to examine whether the results of Experiment 1 could be replicated. Accordingly, in Experiment 2a we carried out a one-way ANOVA with block as the within-subject factor (1/2/3). In Experiment 2b, we added another within-subject factor of image type (repeating or changing). The pattern of gaze behavior in both analyses was similar to Experiment 1 and previous research. In Experiment 2a, we found that the mean duration of fixations increased significantly, F(2, 62) = 3.298, p = 0.04, η_p² = 0.09, whereas the fixation rate, F(2, 62) = 4.31, p = 0.02, η_p² = 0.12, and the amplitude of the saccades, F(2, 62) = 7.775, p < 0.001, η_p² = 0.20, decreased significantly across blocks. This pattern was also reflected in Experiment 2b, in which the interaction between block and image type was significant for fixation duration, F(2, 68) = 8.74, p < 0.001, η_p² = 0.20, fixation rate, F(2, 68) = 8.67, p < 0.001, η_p² = 0.20, and saccade amplitude, F(2, 68) = 9.843, p < 0.001, η_p² = 0.22. To further examine the origin of this interaction, we decomposed the analysis into two one-way ANOVAs, one for each type of image. 
This analysis indicated that the effect of block was significant for the repeating images, for the fixation duration, F(2, 68) = 12.04, p < 0.001, \(\eta_p^2 = 0.26\), fixation rate, F(2, 68) = 8.57, p < 0.001, \(\eta_p^2 = 0.28\), and saccade amplitude, F(2, 68) = 18.71, p < 0.001, \(\eta_p^2 = 0.35\). However, for the changing images, neither the difference in fixation duration, F(2, 68) = 0.291, p = 0.748, \(\eta_p^2 = 0.01\), nor the difference in fixation rate, F(2, 68) = 0.92, p = 0.403, \(\eta_p^2 = 0.02\), nor the difference in saccade amplitude, F(2, 68) = 0.984, p = 0.379, \(\eta_p^2 = 0.03\), was significant. 
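The block-wise analyses above follow the standard one-way repeated-measures decomposition, in which the block sum of squares is tested against the subject-by-block residual. The sketch below illustrates that computation on simulated data; the sample size matches Experiment 2a (32 participants, 3 blocks), but the values and effect magnitudes are illustrative placeholders, not the study's recordings.

```python
import numpy as np

# One-way repeated-measures ANOVA sketch (block as the within-subject factor).
# All data are simulated for illustration; they are NOT the study's measurements.
rng = np.random.default_rng(0)
n_subjects, n_blocks = 32, 3

subject_baseline = rng.normal(280.0, 20.0, size=(n_subjects, 1))  # stable per-subject level (ms)
block_effect = np.array([0.0, 6.0, 12.0])                          # fixation duration rising across blocks
data = subject_baseline + block_effect + rng.normal(0.0, 10.0, size=(n_subjects, n_blocks))

grand_mean = data.mean()
ss_block = n_subjects * ((data.mean(axis=0) - grand_mean) ** 2).sum()
ss_subject = n_blocks * ((data.mean(axis=1) - grand_mean) ** 2).sum()
ss_total = ((data - grand_mean) ** 2).sum()
ss_error = ss_total - ss_block - ss_subject  # residual after removing subject differences

df_block = n_blocks - 1
df_error = df_block * (n_subjects - 1)
f_stat = (ss_block / df_block) / (ss_error / df_error)
eta_p2 = ss_block / (ss_block + ss_error)  # partial eta squared, as reported in the text
print(f"F({df_block}, {df_error}) = {f_stat:.2f}, partial eta^2 = {eta_p2:.2f}")
```

Decomposing the interaction into separate ANOVAs per image type, as in the paragraph above, simply amounts to running this computation once on the repeating-image measures and once on the changing-image measures.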
In the second step of the analysis, we computed planned t tests to explore the modification of gaze behavior when participants observed a flipped version of the repeating images (i.e., gaze behavior during the last learning block). We were interested in two main issues: (a) whether gaze behavior during presentation of the flipped image resembled behavior during the first observation of the unflipped image, and (b) whether the trend of attenuation in gaze exploration would continue in the viewing of the flipped image (i.e., whether the fixation duration would continue to increase and the fixation rate as well as the saccade amplitude would continue to decrease). Thus, in each experiment, we compared the measures extracted from the first and last learning blocks, as well as from the third and last learning blocks (these tests were carried out once for fixation duration, once for fixation rate, and once for saccade amplitude). In general, the tendency toward exploration attenuation across the first three blocks did not continue into the last learning block (i.e., the fixation duration decreased while the fixation rate and the saccade amplitude increased; see Figure 5). Thus, in contrast to the second and third blocks (in which gaze behavior became increasingly more distinct from the first block), in the last learning block gaze behavior reverted toward the initial gaze patterns. 
Figure 5
 
Summary of content-free exploration metrics in Experiments 2a and 2b. Modification of fixation duration (A), fixation rate (B), and saccade amplitude (C) across repetitive exposures (blocks 1–4). Bar plots on the left depict the results of Experiment 2a across the four blocks of repeating images. In the middle, the bar plots represent the findings from Experiment 2b, separately for repeating (turquoise) and changing (novel) images (orange) across blocks. On the right, we depict the results of the Bayesian analysis on the aggregated data from both experiments. The violin plots describe the posterior distribution of the mean difference between the first block and the last learning block (light turquoise) and between the third and the last learning block (dark turquoise). Error bars indicate ±1 SE.
The statistical analysis indicated that the duration of fixation in the last learning block indeed did not significantly differ from the first block (Experiment 2a: t(31) = 0.95, p = 0.349, d = 0.17; Experiment 2b: t(34) = −1.076, p = 0.289, d = −0.18). However, the results for the fixation rate and the saccade amplitude were less unequivocal. For the fixation rate, there was no significant difference between the first and last blocks in Experiment 2a, t(31) = 1.132, p = 0.266, d = 0.2, but a significant difference in Experiment 2b, t(34) = 2.92, p = 0.006, d = 0.49. Similar results were obtained for the saccade amplitude, which showed no significant difference between the first and last blocks in Experiment 2a, t(31) = 1.518, p = 0.139, d = 0.27, but a significant difference in Experiment 2b, t(34) = 2.03, p = 0.05, d = 0.34. However, absence of evidence does not imply evidence of absence: the fact that we did not find a difference between the first and last blocks in Experiment 2a does not necessarily imply that no such difference exists. Therefore, we estimated the mean difference between the two blocks using Bayesian techniques, which are more suitable for assessing null effects (see below, Bayesian analysis). 
However, it was still not clear whether gaze behavior during the last learning block differed significantly from the third block. The statistical analysis revealed that exploration properties changed significantly in the last learning block when flipped images were presented. In comparison to the third block, the fixations during the observation of the flipped images were shorter (Experiment 2a: t(31) = 2.87, p = 0.007, d = 0.51; Experiment 2b: t(34) = 3.75, p < 0.001, d = 0.63), the fixation rate was higher (Experiment 2a: t(31) = −2.02, p = 0.052, d = −0.35; Experiment 2b: t(34) = −2.43, p = 0.02, d = −0.41), and the saccades were longer (Experiment 2a: t(31) = −2.31, p = 0.027, d = −0.41; Experiment 2b: t(34) = −5.47, p < 0.001, d = −0.92). This is the opposite direction to the results of Experiment 1, in which the images in the last learning block were not flipped. 
Feature analysis (content-based)
Similar to Experiment 1, we used the decomposition of the images from Xu et al. (2014) into their semantic features and calculated the percentage of time spent fixating on the semantically meaningful parts of the image. The analytic procedure was carried out in an identical manner to the basic gaze behavior analysis. First, we conducted ANOVAs on the measures of the three initial blocks (in Experiment 2b the factor of image type was included) and then compared the last learning block (with the flipped images) to the first and third blocks using paired t tests. As before, in Experiment 2a we found a significant effect for block, F(2, 62) = 3.537, p = 0.035, \(\eta_p^2 = 0.10\), which indicated that across the first three blocks, less time was spent on the meaningful parts of the image. This pattern was also apparent in Experiment 2b, where we found a significant interaction between block and image type, F(2, 68) = 3.421, p = 0.038, \(\eta_p^2 = 0.09\). When we conducted separate ANOVAs for each image type, we only found a significant effect for block for the repeating images, F(2, 68) = 6.682, p = 0.002, \(\eta_p^2 = 0.16\), but not for the changing ones, F(2, 68) = 0.072, p = 0.93, \(\eta_p^2 = 0.002\), ruling out the effect of tonic fatigue. 
While the tendency to look less at the semantic features was evident in the course of the three first blocks, it changed in the last learning block when participants observed a flipped version of the repeating images (see Figure 6). This change resulted in a seemingly equivalent deployment of gaze towards semantic features during the first and last learning blocks (Experiment 2a: t(31) = 0.52, p = 0.603, d = 0.09; Experiment 2b: t(34) = 1.11, p = 0.274, d = 0.19). However, the difference between the third and the last learning blocks did not reach significance in either experiment (Experiment 2a: t(31) = −1.74, p = 0.09, d = −0.31; Experiment 2b: t(34) = −1.79, p = 0.08, d = −0.3). To examine whether the decrease in the deployment of gaze towards semantic features of the image was a byproduct of the overall attenuation of gaze exploration (i.e., a decrease in fixation rate and saccade amplitude), we carried out the same correlation tests as in Experiment 1. Because in Experiments 2a and 2b participants saw a flipped version of the image in the fourth learning block, we calculated the decrease in each measure as the difference between the first and third learning blocks. If the decrease in fixation time on semantic features could be explained by the decrease in the fixation rate or in the saccade amplitude, we would expect these correlations to be positive. However, both the correlation with fixation rate (Experiment 2a: \(r = -0.5,\ p > 0.5\); Experiment 2b: \(r = -0.18,\ p > 0.5\)) and the correlation with the saccade amplitude (Experiment 2a: \(r = 0.18,\ p = 0.16\); Experiment 2b: \(r = -0.32,\ p > 0.5\)) were not significantly larger than zero. 
Figure 6
 
Summary of content-based exploration in Experiments 2a and 2b. Percentage of fixation time on meaningful semantic features (A) and percentage of fixation time on low-level features (B) across repetitive exposures (blocks 1–4). Bar plots on the left depict the results of Experiment 2a across the four blocks. In the middle, the bar plots represent the findings of Experiment 2b separately for repeating (turquoise) and changing images (orange) across blocks. On the right, we show the results of the Bayesian analysis on the aggregated data from both experiments. The violin plots describe the posterior distribution of the mean difference between the first block and the last learning block (light turquoise) and between the third block and the last learning block (dark turquoise). Error bars indicate ±1 SE.
Similar to Experiment 1, we also examined how the deployment of gaze towards salient features of the image changed across repetitive exposures. In contrast to the semantic analysis, we did not find a significant effect for block in either Experiment 2a, F(2, 62) = 0.321, p = 0.726, \(\eta_p^2 = 0.01\), or Experiment 2b, F(2, 68) = 0.355, p = 0.703, \(\eta_p^2 = 0.01\) (see Figure 6B). These findings are consistent with Kaspar and König (2011b), who showed that the correlation between gaze behavior and low-level features did not change across repetitive exposures. This also suggests that attenuation of exploration is not a general factor influencing all types of image content. 
Bayesian analysis
In order to further examine the effect of the flipped images on gaze behavior, we used two Bayesian techniques. First, we used Bayesian parameter estimation to estimate the difference between the first and last blocks and the difference between the third and last blocks. This method enables us to model current knowledge about the parameters of interest and to update it with the empirical data. In this analysis, we combined the data from Experiments 2a and 2b, and jointly investigated the gaze patterns when observing a flipped version of familiar images. For this purpose we aggregated the data from both experiments and created two vectors of difference scores (once by subtracting the first block from the last one, and once by subtracting the third block from the last one) for each ocular measure (i.e., fixation duration, fixation rate, saccade amplitude, and proportional dwell time on semantic features and low-level salient features). Our main interest was the posterior distribution of the mean of the differences; specifically, whether this distribution was centered around the value of zero (which is the expected value if there is no difference between the two blocks). Accordingly, for each measure we computed the 95% highest posterior density intervals (HDIs), which can be considered a Bayesian alternative to confidence intervals. 
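Given samples from the posterior, a 95% HDI can be computed as the shortest interval that contains 95% of the draws. A minimal sketch, assuming posterior draws are available as an array (the simulated draws below are placeholders, not the study's actual posterior):

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Shortest interval containing `mass` of the posterior draws."""
    s = np.sort(np.asarray(samples))
    n = len(s)
    k = int(np.floor(mass * n))    # number of draws the interval must span
    widths = s[k:] - s[:n - k]     # widths of all candidate intervals
    i = int(np.argmin(widths))     # index of the shortest one
    return s[i], s[i + k]

# Placeholder posterior draws for a mean-difference parameter (e.g., last minus first block)
rng = np.random.default_rng(0)
posterior_draws = rng.normal(loc=-0.1, scale=3.3, size=20_000)
lo, hi = hdi(posterior_draws)
```

For a symmetric posterior like this simulated one, the HDI nearly coincides with the equal-tailed credible interval; the two diverge for skewed posteriors, which is one reason the HDI is the preferred summary here.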
The second Bayesian technique we used was Bayesian model comparison. In contrast to parameter estimation, Bayesian model comparison can be used to compare two hypotheses directly by computing a Bayes factor (BF). The calculation of the BF was based on the Savage-Dickey density ratio method (Wagenmakers, Lodewyckx, Kuriyal, & Grasman, 2010) and relied on the two models' different expectations regarding the effect size. According to the null model, no difference between the two blocks is expected, and therefore the effect size should be zero. In contrast, the alternative model suggests that the effect size differs from zero. Thus, to infer which model was more probable, we divided the height of the posterior distribution of the effect size by the height of the prior distribution of the effect size, at the point of interest (which in this case is zero). As in the Bayesian parameter analysis, we combined the data from Experiments 2a and 2b. We used the two vectors of difference scores (one for the difference between the first block and the fourth block, and one for the difference between the third block and the fourth one) to estimate the posterior distribution of the effect size. We repeated this procedure for each of the ocular measures (i.e., fixation duration, fixation rate, saccade amplitude, and proportional dwell time on semantic and low-level features). 
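Given MCMC draws of the effect size, the Savage-Dickey ratio reduces to comparing two density heights at zero. A sketch under assumed ingredients (the Cauchy prior scale and the simulated posterior draws are illustrative choices, not the study's; its actual priors are specified in its supplementary materials):

```python
import numpy as np
from scipy import stats

# Savage-Dickey density ratio: BF01 = p(delta = 0 | data) / p(delta = 0),
# and BF10 = 1 / BF01. All numbers here are illustrative, not the study's.
prior = stats.cauchy(loc=0.0, scale=0.707)  # a common default prior on effect size

rng = np.random.default_rng(1)
posterior_draws = rng.normal(loc=0.5, scale=0.17, size=50_000)  # hypothetical MCMC output

posterior_at_zero = stats.gaussian_kde(posterior_draws)(0.0)[0]  # density estimate at 0
bf01 = posterior_at_zero / prior.pdf(0.0)
bf10 = 1.0 / bf01  # > 1 favours the alternative (effect size differs from zero)
print(f"BF10 = {bf10:.1f}")
```

Because the posterior here sits well away from zero, its density at zero is far below the prior's, yielding a BF10 well above 1, i.e., evidence for a nonzero effect.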
We used Just Another Gibbs Sampler (JAGS; Plummer, 2013) for both Bayesian parameter estimation and Bayesian model comparison. For both analyses, we used reasonable non-informative priors for all parameters to prevent biasing of the posterior distribution by the chosen prior. A detailed presentation of the statistical models used, as well as the different priors, can be found in the supplementary materials (Supplementary File S1 and Supplementary Figure S1). 
In general, the results of the Bayesian analysis indicated that when observing a flipped version of a familiar image, gaze behavior departed from the behavior observed when an unflipped image was displayed. Specifically, the results of Experiment 1 suggested that when viewing the exact same image, the duration of fixation should increase between the third and fourth blocks (thus, the difference between the fourth block and the third one should be positive). However, the Bayesian estimation indicated that when participants viewed a flipped version of the image during the fourth block, the mean of the difference between the fourth and third blocks was −16.2 ms with a 95% HDI of [−23.7, −8.3]. Hence, the duration of fixations when viewing the flipped images is likely to have decreased between the third and fourth blocks. This is strong evidence that the tendency to have longer fixations on familiar images was reversed when the same familiar images were presented in a flipped configuration. The Bayesian model comparison analysis supported these results by showing that the alternative model, which suggested that the effect size of the difference between the fourth and third blocks was different than zero, was considerably more probable (BF10 = 321.2). 
Moreover, the Bayesian parameter estimation analysis showed that the difference in fixation duration between the first and fourth blocks was −0.1 ms on average, with a 95% HDI of [−6.7, 6.5]. A comparison of the two models showed that the null model, in which the effect size is zero, was 10 times more probable than the alternative one (BF10 = 0.1). Therefore, not only were the flipped images scanned with shorter fixations in comparison to the third presentation of the image, but the duration of these fixations was similar to the duration of the fixations during the first view of the image (see Figure 5A). 
When considering the changes in fixation rate across repetitive exposures, Experiment 1 suggested it should decrease between the third and last learning blocks (such that the difference between the last and third learning block should be negative). However, participants actually made more fixations when observing the flipped images, leading to a mean difference of 0.07 between the third and last learning blocks, with a 95% HDI of [0.02, 0.13]. This pattern of results was supported by the Bayesian model comparison, which favored the alternative model over the null model (BF10 = 10.68). Interestingly, in contrast to the duration of the fixation, the increase in the fixation rate in the fourth block was not large enough to return to the fixation rate in the initial view of the image. Instead, the mean difference between the last and first learning blocks was −0.075 with a 95% HDI of [−0.13, −0.02]. In addition, a comparison of the two models yielded a preference for the alternative model (BF10 = 5). Thus, it is more likely that fixation rate during the last learning block was still lower than the rate in the first one (see Figure 5B). 
A similar pattern of results emerged for the amplitude of the saccades. According to Experiment 1, the saccade amplitude should decrease between the third and fourth presentation of the same image (thus, the mean difference between the last learning block and the third one should be negative). Yet, when the images in the fourth block were presented in a flipped configuration, the mean difference between the fourth and third blocks was positive with an average of 0.46° and a 95% HDI of [0.28, 0.66]. A comparison of the two models supported the conclusion that the amplitude of the saccade in the fourth block was different from the third block, by showing that the alternative model was considerably more probable (BF10 > \(10^5\)). 
Similar to the fixation rate, the increase in the amplitude of the saccades in the fourth block was not large enough to return to the saccade amplitude in the initial view of the image. Specifically, the mean difference between the last and first learning blocks was −0.30° with a 95% HDI of [−0.53, −0.06]. A comparison of the two models showed a slight preference for the alternative model (BF10 = 2.04). Thus, it is more likely that the amplitude of the saccades during the last learning block was still smaller than the amplitude in the first one (see Figure 5C), but the evidence appeared rather weak. 
Finally, we conducted a Bayesian analysis of gaze behavior towards the different features of the image. For the semantic features, this analysis showed that although we would anticipate a decrease in the percentage of fixation time on meaningful areas of the image during the fourth learning block (based on the results of Experiment 1), the percentage of fixation time actually increased when the image was flipped. Specifically, the mean difference between the last and third blocks was 2.02% with a 95% HDI of [0.25, 3.61]. A comparison of the models indicated slight support for the alternative model, which suggests that the effect size was different from zero (BF10 = 2). However, when computing the difference between the last learning block and the first one, the mean difference was −0.74% with a 95% HDI of [−2.17, 0.68]. In this case, the null model was more probable (BF10 = 0.16), suggesting that the effect size equaled zero. Thus, gaze exploration of meaningful areas when viewing a familiar image in a flipped configuration appeared to resemble the pattern of exploration during the first view (see Figure 6A). 
We also carried out a Bayesian analysis for the fixation time on the low-level features of the image. In this case, when computing the difference between the last learning block and the first one, the mean difference was 0.61% with a 95% HDI of [−0.47, 1.66]. A comparison of the models favored the null model, suggesting a similar pattern in the first and last learning blocks (BF10 = 0.18). The comparison between the third and last learning blocks yielded a mean difference of 0.93% with a 95% HDI of [−0.004, 1.98], and inconclusive results regarding the favored model (BF10 = 0.57). It is important to note that while there was a consistent trend for fixation time on the semantic features (a gradual decrease along the first three blocks and a change of trend when viewing the flipped images), no consistent trend emerged for fixation time on the low-level features (see Figure 6B). Accordingly, the differences between two blocks may not reflect a general influence of repetitive exposures across the whole course of the experiment. 
Discussion
In three experiments, we examined several unresolved issues in visual cognition research. Specifically, by exploiting a rather simple paradigm that included repetitive presentations of the same images together with images that changed across blocks, we demonstrated that the changes in gaze behavior are indeed due to familiarity with the specific images and not due to a confounding variable in the experimental design (i.e., time of display). Importantly, this study was designed to explore the impact of familiarity on two types of gaze behavior: content-free (i.e., deployment of fixations and saccades independently of the content of the viewed image) and content-based (deployment of fixations towards specific regions of the image based on their semantic and visual properties). Recently, a growing body of research has underscored the importance of high-level image content during visual exploration by showing an interaction between the semantic properties of the image and gaze position (End & Gamer, 2017; Henderson & Hayes, 2017). While these studies provide convincing arguments for the influence of image meaning during visual exploration, they do not address how gaze predilection towards semantic properties is modulated by familiarity. Moreover, even studies in the field of memory-guided gaze have only referred to characteristics of content-free gaze and neglected the effect of high-level content-based visual exploration. In the current study, we attempted to fill this gap by investigating how fixation time on semantically meaningful features of an image changes across repetitive exposures to the same image. Thus, combining this analysis with the traditional investigation of content-free gaze behavior provides a comprehensive framework to investigate the modification of gaze behavior as a function of familiarity. 
Taken together, the results of all three experiments replicated previous findings for content-free gaze effects (Bradley et al., 2011; Heisz & Shore, 2008; Kaspar & König, 2011a). Our analysis showed that across repetitive exposures, fixation duration increased and fixation rate as well as saccade amplitude decreased. Interestingly, beyond the content-free effects, in all three experiments we also found novel content-based effects across exposures. Specifically, whereas gaze was typically drawn toward semantically meaningful regions of the image, the percentage of overall fixation time on these regions decreased as familiarity increased. Although the effect sizes in this analysis were rather small, these effects were apparent in all experiments. This suggests that people may pay more attention to meaningful properties of the image at first sight. In addition, the attenuation of gaze towards semantically meaningful regions of the image cannot be solely explained by attenuation in the amount of fixation time towards visually salient low-level features of the image. Specifically, in all experiments, there was a decrease in the proportional fixation time on meaningful regions of the image, but a decrease in fixation time on salient features was only obtained in Experiment 1. The fact that we only found a significant effect in Experiment 1, where all blocks consisted only of repeated images (i.e., there were no changing images), might imply that decreased gaze allocation towards low-level features along repetitive exposures relates to the monotony of the task. In addition, the observed content-free effects (i.e., the decrease in fixation rate and saccade amplitude across repetitive exposures) were not positively correlated across participants with the decrease in allocation of gaze towards semantically meaningful regions. Therefore, the content-based effects cannot be explained by the general attenuation of gaze exploration. 
Crucially, the attenuation of gaze exploration was only manifested for the repeating images but not for the changing ones. That is, when participants observed novel images (i.e., changing), the duration of their fixations, their fixation rate and the amplitude of their saccades were stable across blocks. Similarly, participants tended to allocate gaze equally towards semantically meaningful regions of the changing images across blocks. Since the sole difference between the changing and repeating images was previous exposure, we can confidently attribute the pattern of attenuated gaze behavior to familiarity, rather than to tonic fatigue. It is important to note that our results do not rule out a possible influence of phasic fatigue (i.e., momentary fatigue due to current circumstances; e.g., the current stimulus). That is, the attenuation of gaze toward familiar images might be due to boredom with the repeating images. However, our pattern of results indicates that this presumed phasic fatigue is abolished for changing and flipped images, possibly due to temporary arousal. Since this arousal is tightly related to the familiarity of the participants with the images, this alternative explanation does not undermine our interpretation of results that emphasize the role of familiarity in exploration attenuation. 
Our findings are consistent with studies reporting an interaction between representations in long term memory and gaze behavior, which showed fewer fixations or longer ones (Althoff & Cohen, 1999; Lancry-Dayan, Nahari, Ben-Shakhar, & Pertzov, 2018; Peth et al., 2013; Peth, Suchotzki, & Gamer, 2016; Schwedes & Wentura, 2012) directed to personally familiar faces. Althoff and Cohen (1999) interpreted this eye-movement–based memory effect as an optimization of sampling behavior on unfamiliar faces. Here, we show that this interpretation is applicable not only to familiar faces, but also to familiar scenes. Moreover, we demonstrate that this change in exploratory behavior emerged not only for fixations, but also for saccades. Since previous studies (Bradley et al., 2011; Kaspar & König, 2011a) have shown similar effects also for different tasks (i.e., free viewing), this change in the nature of processing may not necessarily be related to the specific task at hand. Importantly, beyond showing similar gaze pattern modifications between faces and scenes, the use of scenes enabled us to explore other aspects of memory-guided gaze effects. Specifically, we showed that the pattern of reduced sampling emerged when considering the high-level content of the image, in that the participants directed their gaze less toward meaningful regions of the image as familiarity increased. Interestingly, a previous study (Antes, 1974) showed that during a single observation of an image (for 20 s) the mean informativeness of fixated locations decreased with viewing time. Thus, as scenes become more familiar (whether over several observations or during a single long inspection), observers may initially pay attention to the meaningful regions of the scene and only then continue to scrutinize less informative regions. 
Our analysis suggests that the effects of repetitive exposures on gaze behavior are indeed a result of increased familiarity. However, it is still not clear which aspect of familiarity (verbatim familiarity or gist familiarity) elicits these effects. In order to resolve this issue, during the final learning block in the last two experiments, participants saw a flipped version of the repeating images. Importantly, in this final block, participants were familiar with the gist of the images, but the configuration of the details in the images was new. Therefore, this design enabled us to distinguish between gist and verbatim familiarity and examine how each influences gaze behavior; that is, if the attenuation tendency persisted when participants observed the flipped image, it would indicate that gist familiarity was the dominant influential element. However, if the trend reversed and gaze behavior returned to its initial pattern in the first block, this would suggest that verbatim familiarity was the main factor. Any other pattern of results would imply the combined influence of both gist and verbatim familiarity. Interestingly, the trend of exploration attenuation was reversed during the inspection of the flipped images, thus indicating that verbatim familiarity with the image was necessary for eliciting the memory-guided gaze effects. Moreover, in the fixation duration and the semantic analyses, observation of the flipped images was similar to the initial viewing of the original image. Bayesian analyses validated that the measures returned to their initial values. Therefore, the influence of gist familiarity on these two exploration metrics can be considered negligible. In contrast, in the fixation rate and saccade amplitude analyses, gaze behavior during the last learning block did not reach its initial level in the first block. Thus, it can be concluded that the number of fixations and the length of saccades may be influenced by both gist and verbatim familiarity. 
The differentiation between verbatim and gist memory traces was originally formulated in the fuzzy-trace theory as an approach to cognitive development (Reyna & Brainerd, 1995), but was quickly harnessed to provide simple explanatory principles for the phenomenon of false memories (Brainerd & Reyna, 2002). This theory posits that when the general meaning of the input is preserved, verbatim and gist traces have opposite effects on false memory; namely, whereas gist traces support false memories of novel items with a similar meaning to pre-experienced items, verbatim traces help to identify these items as unfamiliar. In the current study, we provide a proof of concept that the impact of gist and verbatim traces on visual exploration can be differentiated. To the best of our knowledge, this is the first study to examine how memory-guided gaze behavior is differentially affected by these two aspects of familiarity. Since gist and verbatim familiarity underpin the phenomenon of false memory, our study paves the way for future research investigating how gaze behavior diverges as a function of true and false memories. This application of eye tracking measures to studies of false memories might shed new light on attentional processes during encoding and retrieval of visual information mistakenly judged as familiar. Importantly, beyond the possible contribution of gaze behavior to the field of false memories, insights as to the neuronal mechanism of false memories might contribute to uncovering the mechanism of memory-guided gaze behavior. Specifically, studies on false memories have found a dissociation between two regions of the medial temporal lobe during the elicitation of true and false memories: the hippocampus was activated similarly for true and false memories, but the parahippocampal gyrus was more highly activated for true than for false memories (Cabeza, Rao, Wagner, Mayer, & Schacter, 2001). 
Our results suggest that the effect of familiarity on gaze behavior depends on verbatim traces that elicit true recollection. Hence, activation in the parahippocampal gyrus may be more strongly related to memory-guided effects than the hippocampus. 
Beyond the contribution of these findings to theories of visual exploration, mapping the ways gaze behavior is modified by memory is important for applied reasons. Specifically, advanced knowledge regarding the modulation of gaze by familiarity might have practical implications for forensic and security purposes by revealing information that suspects may try to conceal (Gamer & Pertzov, 2018). Previous studies have indicated that eye-tracking technology can be used to differentiate between familiar and unfamiliar items, based on how these items are scanned (Lancry-Dayan et al., 2018; Nahari, Lancry-Dayan, Ben-Shakhar, & Pertzov, 2019; Peth et al., 2013; Peth et al., 2016; Schwedes & Wentura, 2012, 2016). Because the deduction of familiarity is sometimes required under conditions of weak memory traces (for example, the identification of a rapist by the victim), it is important to understand how the modulation of gaze by memory evolves across different levels of familiarity, and not only for highly familiar items. 
Although the findings here contribute to the literature on memory-guided gaze, several questions remain open. First, although we found a general attenuation of gaze allocation towards meaningful regions of the image, it is still not clear how this effect is modulated for specific semantic categories (such as faces; Guy et al., 2019). In the current design, we aimed to investigate the overall effect of semantically meaningful features and therefore selected images that diverged in terms of their semantic properties. Thus, we did not have enough images and items in each semantic category to investigate the specific influence of different categories. Future studies should focus on defined semantic categories and choose the images accordingly. Second, whereas the flipped version manipulated the different aspects of familiarity, it might have also induced a higher level of arousal. Thus, it is possible that the change in trend when observing the flipped images was mediated by arousal (note, however, that no significant difference in gaze behavior was observed for the new images that were intermixed among the flipped images, thus precluding an effect of tonic arousal). A future study might examine this further by measuring physiological responses together with the ocular measures. In addition, the current study dealt only with short time periods (participants saw the repeating images consecutively within the same session). Therefore, the generalization of the effects of familiarity on gaze behavior to longer periods (i.e., days and weeks) is limited. To determine whether such effects are maintained over longer periods, at least two experimental sessions are required. This type of study could also shed light on the interaction between memory-guided effects and the quality of memory. In our experiment, we could not investigate whether these effects were correlated with memory accuracy because of a ceiling effect (in all experiments, accuracy was above 97%). 
In a two-session study, accuracy rates are likely to be lower, enabling the analysis of the relationship between memory-guided gaze effects and memory performance. 
The current study generates new insights into the interaction between memory and gaze behavior. Specifically, this research emerged from previous findings demonstrating attenuated gaze exploration across repeated exposures. However, while these results were interpreted as memory-guided gaze effects, only the current study validated that familiarity is indeed the cause of this modulation of gaze behavior. Once the role of familiarity was established, we further examined how the modification of gaze behavior by familiarity evolves over time. Our investigation focused on two main directions. First, we expanded previous findings by showing that content-based gaze behavior is modulated across repeated exposures. Specifically, we found that semantically meaningful items are fixated less as familiarity develops. Second, we delved into the mechanism of familiarity that elicits these effects, showing that verbatim familiarity is the main factor underlying attenuated exploration, whereas gist familiarity has a lesser effect. 
Acknowledgments
This study was supported by the Israel Science Foundation (ISF), grant 1747/14 to YP. 
Commercial relationships: none. 
Corresponding author: Oryah C. Lancry-Dayan. 
Address: Department of Psychology, Hebrew University of Jerusalem, Mount Scopus, Jerusalem, Israel. 
References
Althoff, R. R., & Cohen, N. J. (1999). Eye-movement-based memory effect: A reprocessing effect in face perception. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25 (4), 997–1010.
Antes, J. R. (1974). The time course of picture viewing. Journal of Experimental Psychology, 103 (1), 62–70.
Bradley, M. M., Houbova, P., Miccoli, L., Costa, V. D., & Lang, P. J. (2011). Scan patterns when viewing natural scenes: Emotion, complexity, and repetition. Psychophysiology, 48 (11), 1544–1553.
Brainerd, C. J., & Reyna, V. F. (2002). Fuzzy-trace theory and false memory. Current Directions in Psychological Science, 11 (5), 164–169.
Cabeza, R., Rao, S. M., Wagner, A. D., Mayer, A. R., & Schacter, D. L. (2001). Can medial temporal lobe regions distinguish true from false? An event-related functional MRI study of veridical and illusory recognition memory. Proceedings of the National Academy of Sciences, USA, 98 (8), 4805–4810.
End, A., & Gamer, M. (2017). Preferential processing of social features and their interplay with physical saliency in complex naturalistic scenes. Frontiers in Psychology, 8: 418, 1–16.
Foulsham, T., & Underwood, G. (2008). What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition. Journal of Vision, 8 (2): 6, 1–17, https://doi.org/10.1167/8.2.6. [PubMed] [Article]
Gamer, M., & Pertzov, Y. (2018). Detecting concealed knowledge from ocular responses. In Rosenfeld J. P. (Ed.), Detecting concealed information and deception (pp. 169–186). Amsterdam, the Netherlands: Elsevier.
Guy, N., Azulay, H., Kardosh, R., Weiss, Y., Hassin, R. R., Israel, S., & Pertzov, Y. (2019). A novel perceptual trait: Gaze predilection for faces during visual exploration. Scientific Reports, 9 (1): 10714, https://doi.org/10.1038/s41598-019-47110-x.
Harding, G., & Bloj, M. (2010). Real and predicted influence of image manipulations on eye movements during scene recognition. Journal of Vision, 10 (2): 8, 1–17, https://doi.org/10.1167/10.2.8. [PubMed] [Article]
Heisz, J. J., & Shore, D. I. (2008). More efficient scanning for familiar faces. Journal of Vision, 8 (1): 9, 1–10, https://doi.org/10.1167/8.1.9. [PubMed] [Article]
Henderson, J. M., & Hayes, T. R. (2017). Meaning-based guidance of attention in scenes as revealed by meaning maps. Nature Human Behaviour, 1 (10), 743–747.
Itti, L. (2005). Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Visual Cognition, 12 (6), 1093–1123.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20 (11), 1254–1259.
Kaspar, K., & König, P. (2011a). Overt attention and context factors: The impact of repeated presentations, image type, and individual motivation. PLoS One, 6 (7): e21719.
Kaspar, K., & König, P. (2011b). Viewing behavior and the impact of low-level image properties across repeated presentations of complex scenes. Journal of Vision, 11 (13): 26, 1–29, https://doi.org/10.1167/11.13.26. [PubMed] [Article]
Koch, C., & Ullman, S. (1987). Shifts in selective visual attention: Towards the underlying neural circuitry. In Vaina L. M. (Ed.), Matters of intelligence: Conceptual structures in cognitive neuroscience (pp. 115–141). Dordrecht, the Netherlands: D. Reidel Publishing.
Lancry-Dayan, O. C., Nahari, T., Ben-Shakhar, G., & Pertzov, Y. (2018). Do you know him? Gaze dynamics toward familiar faces on a concealed information test. Journal of Applied Research in Memory and Cognition, 7 (2), 291–302.
McGuinness, K., & O'Connor, N. E. (2010). A comparative evaluation of interactive segmentation algorithms. Pattern Recognition, 43 (2), 434–444.
Nahari, T., Lancry-Dayan, O., Ben-Shakhar, G., & Pertzov, Y. (2019). Detecting concealed familiarity using eye movements: The role of task demands. Cognitive Research: Principles and Implications, 4: 10, https://doi.org/10.1186/s41235-019-0162-7.
Parkhurst, D. J., & Niebur, E. (2004). Texture contrast attracts overt visual attention in natural scenes. European Journal of Neuroscience, 19 (3), 783–789.
Peth, J., Kim, J. S., & Gamer, M. (2013). Fixations and eye-blinks allow for detecting concealed crime related memories. International Journal of Psychophysiology, 88 (1), 96–103.
Peth, J., Suchotzki, K., & Gamer, M. (2016). Influence of countermeasures on the validity of the Concealed Information Test. Psychophysiology, 53 (9), 1429–1440.
Plummer, M. (2013). rjags: Bayesian graphical models using MCMC (R package version 3-10).
Reyna, V. F., & Brainerd, C. J. (1995). Fuzzy-trace theory: An interim synthesis. Learning and Individual Differences, 7 (1), 1–75.
Ryan, J. D., Hannula, D. E., & Cohen, N. J. (2007). The obligatory effects of memory on eye movements. Memory, 15 (5), 508–525.
Schwedes, C., & Wentura, D. (2012). The revealing glance: Eye gaze behavior to concealed information. Memory & Cognition, 40 (4), 642–651.
Schwedes, C., & Wentura, D. (2016). Through the eyes to memory: Fixation durations as an early indirect index of concealed knowledge. Memory & Cognition, 44 (8), 1244–1258.
Underwood, G., Foulsham, T., & Humphrey, K. (2009). Saliency and scan patterns in the inspection of real-world scenes: Eye movements during encoding and recognition. Visual Cognition, 17 (6–7), 812–834.
Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage–Dickey method. Cognitive Psychology, 60 (3), 158–189.
Xu, J., Jiang, M., Wang, S., Kankanhalli, M. S., & Zhao, Q. (2014). Predicting human gaze beyond pixels. Journal of Vision, 14 (1): 28, 1–20, https://doi.org/10.1167/14.1.28. [PubMed] [Article]
Footnotes
1  One purpose of Experiment 2a was to investigate whether the familiarity effects could be explained by tonic fatigue. Therefore, we included novel changing images in each block. In Experiment 2a, we found significant differences in gaze behavior between the repeating and changing images from the first presentation onward (see supplementary material), even though in the first block all images were presented for the first time. These differences may stem from the fact that in Experiment 2a all participants saw the same images in the repeating and changing sets. Since these sets differed consistently in various ways (e.g., the amount of semantic information in the image), this could lead to a significant difference in gaze behavior already in the initial observation of the images. Accordingly, in the analysis of Experiment 2a we discarded the data related to the changing images and considered only the repeating ones. The comparison between repeating and changing images is further discussed in Experiment 2b, in which we counterbalanced the changing and repeating sets across participants.
Figure 1
 
Experimental design for Experiment 1. Participants were repetitively exposed to a set of 40 images across four blocks. The order of images in each block was randomized. On the recognition block, participants saw a set of 80 images and were asked to indicate by a key press whether each image was old or new.
Figure 2
 
Summary of content-free gaze behavior in Experiment 1. Modification of fixation durations (left), fixation rate (middle), and saccade amplitude (right) across repetitive exposures (blocks 1–4). Error bars indicate ±1 SE.
Figure 3
 
Summary of content-based exploration in Experiment 1. Percentage of fixation time on semantically meaningful features (left) and percentage of fixation time on salient low-level features (right) across repetitive exposures (blocks 1–4). Error bars indicate ±1 SE.
Figure 4
 
The experimental design in Experiments 2a and 2b. Participants were exposed to a set of 40 images in each of the four blocks. Half of these images were repetitively displayed across all blocks (turquoise frame), and half were novel and changed between blocks (orange frame). The images in each block appeared in random order, and were presented for 5,000 ms. Importantly, on the fourth block, the repeating images were flipped. On the recognition block, participants saw a set of 40 images and were asked to report by a key press whether each image was old or new.
Figure 5
 
Summary of content-free exploration metrics in Experiments 2a and 2b. Modification of fixation duration (A), fixation rate (B), and saccade amplitude (C) across repetitive exposures (blocks 1–4). Bar plots on the left depict the results of Experiment 2a across the four blocks of repeating images. In the middle, the bar plots represent the findings from Experiment 2b, separately for repeating (turquoise) and changing (novel) images (orange) across blocks. On the right, we depict the results of the Bayesian analysis on the aggregated data from both experiments. The violin plots describe the posterior distribution of the mean difference between the first block and the last learning block (light turquoise) and between the third and the last learning block (dark turquoise). Error bars indicate ±1 SE.
Figure 6
 
Summary of content-based exploration in Experiments 2a and 2b. Percentage of fixation time on meaningful semantic features (A) and percentage of fixation time on low-level features (B) across repetitive exposures (blocks 1–4). Bar plots on the left depict the results of Experiment 2a across the four blocks. In the middle, the bar plots represent the findings of Experiment 2b separately for repeating (turquoise) and changing images (orange) across blocks. On the right, we show the results of the Bayesian analysis on the aggregated data from both experiments. The violin plots describe the posterior distribution of the mean difference between the first block and the last learning block (light turquoise) and between the third block and the last learning block (dark turquoise). Error bars indicate ±1 SE.
Supplement 1
Supplement 2