Open Access
Article  |   June 2019
Spatial statistics for gaze patterns in scene viewing: Effects of repeated viewing
Author Affiliations
  • Hans A. Trukenbrod
    University of Potsdam, Potsdam, Germany
    hans.trukenbrod@uni-potsdam.de
  • Simon Barthelmé
    Centre National de la Recherche Scientifique, Gipsa-lab, Grenoble Institut National Polytechnique, France
  • Felix A. Wichmann
    Eberhard Karls University of Tübingen, Tübingen, Germany
    Bernstein Center for Computational Neuroscience Tübingen, Tübingen, Germany
    Max Planck Institute for Intelligent Systems, Tübingen, Germany
  • Ralf Engbert
    University of Potsdam, Potsdam, Germany
Journal of Vision June 2019, Vol.19, 5. doi:10.1167/19.6.5
Abstract

Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function, a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within 4° is stronger than expected from chance. Furthermore, the pair correlation function reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.

Introduction
When we move our eyes during scene viewing, fixation locations are not selected uniformly. Instead, fixations cluster in specific areas while other areas remain unfixated by observers, even after long exploration intervals. Most research on this inhomogeneity has tried to identify factors that contribute to the placement of fixations across trials, while statistical correlations within trials have mostly been ignored. Here we describe the pair correlation function (PCF), a method from spatial statistics, to investigate the relationship between fixation locations of individual scanpaths. In a first part, we provide a step-by-step tutorial on how the PCF can be applied to eye movement data and demonstrate that fixations in a scanpath are more aggregated than expected from the distribution of fixation locations of all subjects (Engbert, Trukenbrod, Barthelmé, & Wichmann, 2015). In a second part, we show that a long-term memory manipulation, i.e., the second inspection of an image, leads to even more aggregation. We discuss the results in the light of simulations of the SceneWalk model of Engbert et al. (2015), a dynamical model for the generation of saccadic sequences in scene viewing. Furthermore, we demonstrate with simulated fixation locations that the PCF can be used to test specific hypotheses and that seemingly similar distributions of fixation locations may lead to very different PCFs. 
Eye movements during scene perception
Fixation locations during scene perception are influenced by bottom-up and top-down as well as low-level and high-level factors (see Schütt, Rothkegel, Trukenbrod, Engbert, & Wichmann, 2018, for an extensive discussion). Bottom-up factors refer to parts of the image that attract gaze independent of the internal state of an observer and might differ in complexity. Simple low-level features are extracted early in the visual hierarchy while high-level features are complex shapes and are extracted late in the visual hierarchy. Examples of bottom-up, low-level features that predict fixation locations are luminance contrast and edge density (Mannan, Ruddock, & Wooding, 1997; Reinagel & Zador, 1999; Tatler, Baddeley, & Gilchrist, 2005). The strength of the relationship for different image features depends on the type of image viewed (Parkhurst, Law, & Niebur, 2002). Examples of bottom-up, high-level features that predict fixation locations are objects (e.g., faces, persons, cars; Cerf, Harel, Einhäuser, & Koch, 2007; Einhäuser, Spain, & Perona, 2008; Judd, Ehinger, Durand, & Torralba, 2009). This interpretation is supported by the existence of a preferred viewing location close to the object center in scene viewing (Nuthmann & Henderson, 2010). 
The influence of bottom-up factors has led to the development of computational saliency models (Itti & Koch, 2001; cf. Koch & Ullman, 1985) and a variety of models have been put forward over the years (e.g., Bruce & Tsotsos, 2009; Kienzle, Franz, Schölkopf, & Wichmann, 2009; Kümmerer, Wallis, & Bethge, 2016; Parkhurst et al., 2002; Vig, Dorr, & Cox, 2014; Zhang, Tong, Marks, Shan, & Cottrell, 2008). In particular since the rise of sophisticated machine-learning algorithms, these models perform well in predicting fixation locations when evaluated with a data set obtained under unconstrained (“free”) viewing (Bylinskii et al., 2015). 
Top-down factors refer to cognitive influences on fixation locations and depend on the internal state of an observer (e.g., aims, memory load, knowledge). As for bottom-up factors, top-down factors differ with respect to the complexity of features. Examples of top-down, low-level factors can be found in visual search, where observers search for a specific color or a line of a specific orientation. Examples of top-down, high-level factors are task instructions where observers need to judge the age of people or their wealth. In his seminal work, Yarbus (1967) reported anecdotal evidence for top-down control. Scanpaths were influenced by the instruction given before viewing an image (see also Castelhano, Mack, & Henderson, 2009). These top-down effects strengthen when observers are engaged in natural tasks like preparation of a sandwich or during tea-making (Hayhoe & Ballard, 2005; Land & Hayhoe, 2001). In natural tasks, fixations generally support the smooth execution of a task and occur on objects just-in-time (Ballard, Hayhoe, & Rao, 1997) or as look-ahead fixations to inspect objects needed later during the task (Pelz & Canosa, 2001). Thus, the eyes do not necessarily fixate the most salient location (bottom-up) but are rather directed toward informative locations that are important for task execution. Another source of top-down control comes from memory representations due to reinspection of previously seen images (Kaspar & König, 2011a, 2011b) and due to the acquired scene or world knowledge (Torralba, Oliva, Castelhano, & Henderson, 2006). Incorporating such knowledge by contextual priors improves the predictions of saliency models (Judd et al., 2009; Torralba et al., 2006). 
In addition, fixation locations depend on systematic tendencies that are common to eye movements in general (Tatler & Vincent, 2008). These systematic tendencies lead to spatial and directional biases in the selection of fixation locations. A well-known example is the central fixation bias during scene viewing (Tatler, 2007). On average, participants prefer to fixate near the center of an image rather than toward the periphery. The central fixation bias is strongest in the beginning of a trial and reaches an asymptotic level after a few fixations (Rothkegel, Trukenbrod, Schütt, Wichmann, & Engbert, 2017). Other systematic tendencies include the preference to generate positively skewed, long-tailed distributions of saccade amplitudes or the preference to execute saccades in the cardinal directions during scene viewing (Tatler & Vincent, 2008). The latter effect is shaped by image features and varies systematically with the perceived horizon. Tilting an image results in an equally tilted distribution of saccade directions (Foulsham & Kingstone, 2010; Foulsham, Kingstone, & Underwood, 2008). In addition, observers tend to make saccades in the same direction as the preceding saccade (cf. saccadic momentum) and a large number of fixations bring the eyes back to the last or penultimate fixation location (cf. facilitation of return; Smith & Henderson, 2009; Wilming, Harst, Schmidt, & König, 2013). Finally, inhibition of return is believed to facilitate exploration during visual search (Klein & MacInnes, 1999) and has been suggested as a mechanism to drive attention during scene perception (Itti & Koch, 2001). Inhibition of return during scene perception seems to primarily prolong fixation durations before return saccades to previously fixated locations (Hooge, Over, van Wezel, & Frens, 2005; Smith & Henderson, 2009) and seems to shape spatial dynamics of scanpaths in the long run (Rothkegel, Trukenbrod, Schütt, Wichmann, & Engbert, 2016). 
Adding systematic tendencies to saliency models further improves their predictions (Le Meur & Liu, 2015). 
Eye movements and long-term memory: Repeated presentation
Humans have a remarkable capacity to store images in long-term memory (Brady, Konkle, Alvarez, & Oliva, 2008; Konkle, Brady, Alvarez, & Oliva, 2010; Standing, Conezio, & Haber, 1970). These representations are not limited to the gist of a scene but include abstract representations of objects, in particular of previously fixated objects (Hollingworth & Henderson, 2002; Hollingworth, Williams, & Henderson, 2001). Scene memory accumulates when the same image is viewed multiple times (Melcher, 2001; Melcher & Kowler, 2001). Because of these scene-specific memories, repeated presentations can be used to investigate effects of long-term memory on eye movements. 
In two studies, Kaspar and König (2011a, 2011b) studied the interaction of bottom-up and top-down influences and long-term memories by presenting images five times. In general, fixation locations tend to be similar across different inspections of the same image by the same participant (Kaspar & König, 2011b). Similarity was strongest for successive presentations and decreased with increasing distance between presentations. The correlation between fixation locations and low-level features, however, remained rather constant. In contrast, the number of fixated regions decreased after multiple presentations of images, as did the average number of fixations. Furthermore, saccade amplitudes were largest during the first presentation and decreased on subsequent presentations. 
In a second study Kaspar and König (2011a) explored the effects of repeated presentations on top-down influences on saccade selection. Motivation (as measured via the reported interestingness of the viewed images) and a personality trait of participants (action orientation) influenced repeated viewing of images. In addition, fixation durations and variability between participants' fixation locations increased, whereas saccade frequency, saccade length, and entropy of fixation locations decreased. The authors concluded that the locus of attention became increasingly local with repeated presentations of images. Thus, participants scrutinized individual regions during later presentations. This interpretation was supported by participants' self-reports and was augmented in participants who found the images more interesting. 
Research questions
Much progress has been made to understand where observers fixate in an image. This research primarily focused on fixation locations across observers while neglecting the fixation history during a trial of a single observer. Fixation locations, however, exhibit strong spatial correlations during a trial and are not independent of one another (Engbert et al., 2015) and adding mechanisms that generate more realistic scanpaths improves performance of saliency models (Le Meur & Liu, 2015). Barthelmé, Trukenbrod, Engbert, and Wichmann (2013) introduced spatial point processes as a theoretical framework for the study of gaze patterns, and demonstrated how this helps to turn qualitative into quantitative questions. Here, we present a method from spatial statistics, i.e., the PCF, to estimate spatial correlations between fixation locations during a trial in the presence of spatial inhomogeneity (Engbert et al., 2015). Before applying the PCF to eye movement data during the first and second inspection of an image, we briefly describe the theoretical details of the PCF. We demonstrate (a) that the PCF provides rigorous statistical evidence for aggregation of fixation locations in single trials, (b) that this effect is not well explained by the tendency of participants to generate short saccade amplitudes, (c) that the PCF is differentially affected by a memory manipulation (first vs. second inspection of an image), and (d) that these differences can be explained by modulations of the attentional span within our SceneWalk model (Engbert et al., 2015). 
Pair correlation function
We refer the reader to Diggle (2013) for an introduction to the statistical analysis of point patterns and to Law et al. (2009) for a detailed description of the PCF and the application to point patterns in plant ecology. 
Analyzing spatial point patterns: PCF
The density estimation of a point pattern, i.e., the probability of observing a point at a given location, is a first-order statistic for a spatial point process, which, therefore, plays the role of the mean value in classical statistics. In the upcoming sections we denote this first-order statistic as the intensity λ(x). In the case of eye movements the intensity represents the local average spatial density of fixations at a location x. Point patterns that are generated by a homogeneous point process are uniformly distributed and the underlying intensity λ(x) = λ is constant for all x. For inhomogeneous point processes, where the two-dimensional (2-D) density of fixation locations is non-uniformly distributed, the intensity λ(x) is estimated for each location x separately, i.e.,  
\begin{equation}\tag{1}\lambda (x) = \lim_{|dx| \to 0} \left\{ \frac{E[N(dx)]}{|dx|} \right\}.\end{equation}
Here, the density is given by the expected number of fixations E[N] falling into a disc of infinitesimal size |dx|.  
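As an illustration of how an intensity λ(x) can be estimated from observed fixation locations, a minimal sketch of a Gaussian kernel estimate follows. The function name, the example coordinates, and the bandwidth are assumptions made for illustration and are not taken from the analysis code; no edge correction is applied.

```python
import math

def kernel_intensity(points, x, y, bandwidth):
    """Gaussian kernel estimate of the intensity at (x, y), in points per unit area.

    Each observed point contributes a 2-D Gaussian bump with standard
    deviation `bandwidth`; the bumps integrate to the number of points.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total

# Hypothetical fixation locations (degrees of visual angle).
fixations = [(10.0, 10.0), (11.0, 10.5), (10.5, 9.0), (25.0, 20.0)]

near = kernel_intensity(fixations, 10.5, 10.0, bandwidth=2.0)  # inside a cluster
far = kernel_intensity(fixations, 30.0, 5.0, bandwidth=2.0)    # far from all points
print(near, far)
```

The estimate is large where fixations cluster and close to zero far away from any fixation; the bandwidth controls how strongly the point pattern is smoothed.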
While first-order statistics are concerned with locations of single points (while ignoring spatial correlations between points), second-order statistics describe the relation between pairs of points. A crucial second-order statistic for the computation of PCFs is the pair density ρ(x, y). The pair density (or second-order intensity function) describes the probability of simultaneously observing points generated by a point process in two discs of infinitesimal size |dx| and |dy| with centers x and y,  
\begin{equation}\tag{2}\rho (x,y) = \mathop {\lim }\limits_{|dx|,|dy| \to 0} \left\{ {{{E[N(dx)N(dy)]} \over {|dx||dy|}}} \right\}.\end{equation}
 
For a stationary, isotropic point process the pair density ρ(x, y) = ρ(r) depends only on the distance between pairs of points, where r = ||x − y|| (Diggle, 2013). We estimate the pair density \(\hat \rho (r)\) at distance r by  
\begin{equation}\tag{3}\hat \rho (r) = \sum\limits_{x,y \in W}^ {\ne} {{{k(||x - y|| - r)} \over {2\pi r{A_{||x - y||}}}}} \end{equation}
where k represents an Epanechnikov kernel¹ (Baddeley, Rubak, & Turner, 2015) and \({A_{||x - y||}}\) is an edge-correction factor to counter the loss of pairs of points near the boundary of the inspection window, which is particularly important at large distances \(||x - y||\). As edge correction we chose to use the translation correction that weights each pair of points (xi, xj) with the reciprocal of the fraction of the window area, in which a first point xi could be placed, so that both points xi, xj would be observable (Baddeley et al., 2015).  
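The two ingredients of this estimator, the kernel and the translation correction, can be sketched as follows for a rectangular window. This is an illustrative sketch: the function names, the parameterization of the kernel by its half-width, and the window size are assumptions, and the way the weight enters the factor A depends on a normalization convention that is not spelled out here.

```python
import math

def epanechnikov(u, half_width):
    """Epanechnikov kernel on [-half_width, half_width]; integrates to 1."""
    if abs(u) >= half_width:
        return 0.0
    return 0.75 * (1.0 - (u / half_width) ** 2) / half_width

def translation_weight(p, q, width, height):
    """Translation-correction weight for the pair (p, q) in a width x height window.

    The window and its copy shifted by (q - p) overlap in a rectangle of size
    (width - |dx|) x (height - |dy|). The weight is the reciprocal of the
    fraction of the window area covered by this overlap.
    """
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    overlap = max(width - dx, 0.0) * max(height - dy, 0.0)
    if overlap == 0.0:
        return math.inf
    return (width * height) / overlap

# Numerical check (midpoint rule) that the kernel integrates to 1.
step = 0.001
area = sum(epanechnikov(-1.0 + (i + 0.5) * step, 1.0) * step for i in range(2000))

# Pairs that are far apart are harder to observe and receive larger weights.
w_close = translation_weight((10.0, 10.0), (11.0, 10.0), 30.0, 20.0)
w_far = translation_weight((5.0, 10.0), (20.0, 10.0), 30.0, 20.0)
print(area, w_close, w_far)
```

The weight grows with the separation of the two points, which is why the correction matters most at large distances.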
Since the pair density depends on the number of points, the PCF is computed as a normalized version of the pair density (i.e., the PCF is given by the intensity-weighted pair density). In the case of a homogeneous PCF the pair density is weighted by a constant at all distances,  
\begin{equation}\tag{4}{g_{hom}}(r) = {1 \over {{\lambda ^2}}}\rho (r).\end{equation}
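As a brief check of this normalization, consider a homogeneous Poisson process (complete spatial randomness), for which the numbers of points in disjoint regions are independent. For x ≠ y and infinitesimal discs, the expectation in Equation 2 then factorizes,  
\begin{equation*}E[N(dx)N(dy)] = E[N(dx)]\,E[N(dy)] = \lambda |dx|\,\lambda |dy| \quad \Rightarrow \quad \rho (r) = {\lambda ^2}, \quad {g_{hom}}(r) = {1 \over {{\lambda ^2}}}\,{\lambda ^2} = 1.\end{equation*}
Independent points thus yield a PCF of one at all distances, which is the reference value against which aggregation and inhibition are judged. 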
 
However, fixation locations are not distributed homogeneously and show aggregation due to bottom-up, top-down, and systematic oculomotor factors. Fortunately, the PCF can be computed for inhomogeneous processes by weighting the pair density with the intensities λ(xi) of an inhomogeneous point process at each fixation location,  
\begin{equation}\tag{5}{g_{inhom}}(r) = \sum\limits_{x,y \in W}^ {\ne} {{1 \over {\lambda (x)\lambda (y)}}} \rho (r)\end{equation}
 
The resulting PCFs will be nonnegative, g(r) ≥ 0, at all distances r. Values of the PCF close to one, g(r) ≈ 1, indicate that pairs of points at distance r are independent. Points at distance r occur solely due to the underlying intensity λ(x). For larger values, i.e., g(r) > 1, pairs of points are more abundant at distance r than expected from the intensity λ(x). Thus, pairs of points at distance r interact and observing a point x increases the probability of observing a point y at distance r. The probability of observing point y is higher than predicted by the local intensity λ(y). Conversely, smaller values, g(r) < 1, reveal that pairs of points are less abundant at distance r than expected from the intensity. Observing a point reduces the probability of observing a second point at distance r. 
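To make this interpretation concrete, the following sketch estimates an intensity-weighted PCF for two simulated patterns on a rectangular window. It is a simplified illustration, not the spatstat implementation used for the analyses: the function names, the window size, the kernel half-widths, and the use of the window overlap area as edge-correction term are assumptions, and constant intensities are supplied where an application would use an estimated intensity at each point.

```python
import math
import random

def epanechnikov(u, half_width):
    """Epanechnikov kernel on [-half_width, half_width]; integrates to 1."""
    if abs(u) >= half_width:
        return 0.0
    return 0.75 * (1.0 - (u / half_width) ** 2) / half_width

def pcf_inhom(points, intensities, r, half_width, width, height):
    """Kernel estimate of the intensity-weighted PCF at distance r."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            dx, dy = abs(xi - xj), abs(yi - yj)
            k = epanechnikov(math.hypot(dx, dy) - r, half_width)
            # Translation correction (assumed form): overlap area of the
            # window with its copy shifted by the pair's difference vector.
            overlap = (width - dx) * (height - dy)
            if k == 0.0 or overlap <= 0.0:
                continue
            total += k / (intensities[i] * intensities[j] * 2.0 * math.pi * r * overlap)
    return total

rng = random.Random(1)
width, height = 20.0, 20.0

# Independent, uniformly distributed points: g(r) should be close to 1.
csr = [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(300)]
lam = len(csr) / (width * height)
g_csr = pcf_inhom(csr, [lam] * len(csr), 2.0, 0.5, width, height)

# Clustered points (offspring scattered around random parents): g(r) > 1 at short r.
clustered = []
for parent in range(30):
    cx, cy = rng.uniform(0.0, width), rng.uniform(0.0, height)
    for child in range(10):
        x, y = rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)
        if 0.0 <= x <= width and 0.0 <= y <= height:
            clustered.append((x, y))
lam_c = len(clustered) / (width * height)
g_clustered = pcf_inhom(clustered, [lam_c] * len(clustered), 0.5, 0.25, width, height)

print(g_csr, g_clustered)
```

In this toy setting the independent pattern yields a value near one, whereas the clustered pattern yields a value well above one at short distances.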
Figure 1 shows the PCF of three different point patterns. In all examples we computed the PCF assuming a homogeneous point process with constant intensity λ(x) = λ (cf. Equation 4) since deviations from uniformity are easier to interpret visually. The same interpretation, however, can be applied to inhomogeneous PCFs. The first example shows a regular point pattern (left column). Visual inspection of the points indicates a grid-like arrangement: The distance between neighboring points is relatively constant. The resulting PCF (bottom row) summarizes this behavior. At short distances, r < 4, the PCF reveals a strong inhibitory effect, g(r) ≈ 0. The existence of a point impedes the occurrence of other points within this radius. At medium distances, 4 < r < 6, the PCF reveals aggregation of points, g(r) > 1. Observing a point boosts the occurrence of points at this distance, which produces the grid-like appearance. At larger distances, r > 6, the PCF lends support to the hypothesis of independence of points, since g(r) ≈ 1. We observe no long-range interaction of pairs of points and the distribution of distances can be explained by the density distribution, λ(x). 
Figure 1
 
Examples of point patterns (upper panels): regular, random, and aggregated (clustered). Corresponding PCFs (lower panels).
The second example (Figure 1, central panels) shows the realization of a point pattern with complete spatial randomness (CSR). Points are distributed uniformly. The resulting PCF reveals the independence of points at all distances, g(r) ≈ 1. Note, the aggregation at short distances r is an artifact generated by the estimation process. Finally, the third example illustrates an aggregated point process (right panels). The PCF at short distances, r < 2, reveals aggregation, g(r) > 1, while the PCF at longer distances reveals independence, g(r) ≈ 1. Thus, observing a point increases the likelihood of observing other points in close proximity. The occurrence of distant points can be explained by the uniform distribution. 
Finally, deviations from complete spatial randomness, i.e., g(r) ≠ 1, can be summed up and serve as a useful summary statistic of the overall behavior:  
\begin{equation}\tag{6}\chi = \int_0^\infty ( g(r) - 1{)^2}\,dr.\end{equation}
 
In practical applications, the deviation from complete spatial randomness \(\chi \) is computed over a finite interval, e.g., \(\int_a^b ( g(r) - 1)^2\,dr\) with a < b. 
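For a PCF that has been estimated on a grid of distances, the finite-interval version of Equation 6 can be approximated numerically. The sketch below uses the trapezoidal rule; the function name and the example PCF are assumptions chosen for illustration.

```python
import math

def deviation_from_csr(r_values, g_values):
    """Trapezoidal approximation of the integral of (g(r) - 1)^2 over the grid."""
    total = 0.0
    for k in range(1, len(r_values)):
        left = (g_values[k - 1] - 1.0) ** 2
        right = (g_values[k] - 1.0) ** 2
        total += 0.5 * (left + right) * (r_values[k] - r_values[k - 1])
    return total

# Hypothetical PCF that decays toward 1, evaluated from a = 0.1 to b = 4.0.
r_values = [0.1 * k for k in range(1, 41)]
g_values = [1.0 + math.exp(-r) for r in r_values]
chi = deviation_from_csr(r_values, g_values)
print(chi)
```

A PCF that equals one everywhere gives a deviation of zero; any aggregation or inhibition increases the value because the differences are squared.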
Application of the PCF to fixation locations
In this section we demonstrate how to compute the PCFs for eye movement data in three steps. The PCF reveals whether the distribution of fixation locations during a single scanpath can be explained by the overall inhomogeneity observed across all observers or whether fixation locations of a single scanpath contain additional spatial correlations. All analyses and graphs reported have been implemented in R using the spatstat (Baddeley & Turner, 2005; Baddeley et al., 2015) and ggplot2 packages (Wickham, 2009). R-code of our analysis can be found here: http://www.rpubs.com/Hans/PCF
Step 1: Simulate inhomogeneous and homogeneous control processes
To evaluate our PCF computations, we simulate two control point processes, namely a homogeneous and an inhomogeneous point process. Points (fixation locations) are sampled independently from each other in both control processes, and due to the independence of points, we do not expect to observe any correlations between points at distance r. Any observed correlations would be spurious and depend on the data structure (e.g., length of fixation sequences) or a wrong parameterization of the method. Hence, both control processes ensure that correlations in the PCF arise from the empirical data and not from the method itself. In addition, the inhomogeneous point process is used in the second step to estimate an optimal bandwidth for the intensity estimation of the PCF. 
Figure 2 (left panel) shows fixations on an image from participants viewing the same scene for 10 s (see Methods for details). As expected, fixation locations are not uniformly distributed and indicate inhomogeneity. The estimated intensity \({\hat \lambda _s}(x)\) of all fixation locations is depicted by gray shading where darker areas represent higher intensities. We used Scott's rule of thumb to compute the smoothing bandwidth for the intensity estimation (R function: bw.scott) from the spatstat package (Baddeley & Turner, 2005) and estimated an optimal bandwidth for each image. The estimated intensity \({\hat \lambda _s}(x)\) is used to simulate an inhomogeneous control point process. The inhomogeneous point process (central panel) samples points proportionally to the intensity \({\hat \lambda _s}(x)\), which explains the resemblance between the experimental and simulated distributions. The homogeneous point process (right panel) samples from a uniform distribution across the entire image area; therefore, the subsequently estimated intensity (gray shading) is approximately constant, \({\hat \lambda _s}(x) \approx \lambda \). For every empirical scanpath, we used Monte Carlo simulation to generate one scanpath of equal length (same number of fixations) from the inhomogeneous point process and one from the homogeneous point process. 
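The logic of the two control processes can be sketched as follows. This is an illustration and not the R code used for the analyses: the function names, the window size, and the toy intensity (a single Gaussian bump standing in for an estimated intensity) are assumptions. Points of the inhomogeneous process are drawn by rejection sampling, so that each point is independent of all others and its location is distributed proportionally to the intensity.

```python
import math
import random

def sample_homogeneous(n, width, height, rng):
    """Draw n independent points uniformly from the window."""
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]

def sample_inhomogeneous(n, width, height, intensity, intensity_max, rng):
    """Draw n independent points with density proportional to intensity(x, y).

    Rejection sampling: propose uniform locations and accept each with
    probability intensity(x, y) / intensity_max.
    """
    points = []
    while len(points) < n:
        x, y = rng.uniform(0.0, width), rng.uniform(0.0, height)
        if rng.uniform(0.0, intensity_max) < intensity(x, y):
            points.append((x, y))
    return points

def toy_intensity(x, y):
    """Hypothetical intensity: one Gaussian bump at the window center (maximum 1)."""
    return math.exp(-((x - 15.0) ** 2 + (y - 10.0) ** 2) / (2.0 * 3.0 ** 2))

rng = random.Random(1)
n_fixations = 30  # same length as the (hypothetical) empirical scanpath

inhom = sample_inhomogeneous(n_fixations, 30.0, 20.0, toy_intensity, 1.0, rng)
hom = sample_homogeneous(n_fixations, 30.0, 20.0, rng)
print(len(inhom), len(hom))
```

Because every point is drawn without reference to the previously drawn points, any spatial correlation found in such a simulated scanpath must stem from the estimation procedure rather than from the process itself.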
Figure 2
 
Distribution of fixation locations. Dots represent fixation locations; estimated intensities of each point pattern are illustrated by gray shading. Experimental fixation locations (red, left panel), locations generated by an inhomogeneous point process (green, central panel), and a homogeneous point process (blue, right panel).
Examples of the empirical and simulated scanpaths are visualized in Figure 3. Each row shows matching scanpaths of the empirical, inhomogeneous, and homogeneous point process. The estimated overall intensity of each point process on an image is displayed via gray shading. Fixations are likely to be located in areas of high average intensity, \(\hat \lambda (x)\). However, each sequence consists of a unique set of points where some scanpaths explore otherwise ignored or “missed” locations of high intensity (Figure 3). Overall, scanpaths of the inhomogeneous and homogeneous point processes reveal less systematic exploration behavior than the empirical data. As a consequence, saccade amplitudes in the simulated scanpaths are considerably larger. 
Figure 3
 
Three representative scanpaths during viewing of an image. Each row represents the scanpath during a single trial of the empirical data (red, left panel), the simulated inhomogeneous point process (green, central panel), and the simulated homogeneous point process (blue, right panel).
Step 2: Choose optimal bandwidth for intensity estimation of PCF
Next we need to choose an optimal bandwidth for estimation of the intensity \(\hat \lambda (x)\) used to calculate the inhomogeneous PCF \(\hat g(r)\) (see Equation 5). Note that this bandwidth differs from the bandwidth estimated in Step 1 to simulate the inhomogeneous point process. Since fixation locations in scanpaths of both control point processes are sampled independently of the preceding fixation history, average PCFs of both point processes are expected to reveal no spatial correlations, i.e., we expect \(\hat g(r) \approx 1\) at all distances r. We computed the deviation from complete spatial randomness (Equation 6) for PCFs computed with different bandwidths. We varied bandwidths from 0.1° to 10° in steps of 0.1° and computed the deviation from complete spatial randomness for each scanpath. The average deviation at each bandwidth is plotted in Figure 4. Lines represent individual images. For all images, the deviation increases for small bandwidths and large bandwidths with an optimal bandwidth between 1.5° and 5°. The bandwidth yielding the smallest deviation was chosen for the intensity estimation of the PCF. 
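The selection itself amounts to a grid search, sketched below. The function names are assumptions, and the U-shaped toy function merely stands in for the average deviation from complete spatial randomness, which in an application would be computed from the PCFs of the simulated control scanpaths at each bandwidth.

```python
def bandwidth_grid(start=0.1, stop=10.0, step=0.1):
    """Candidate bandwidths in degrees, from start to stop inclusive."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]

def select_bandwidth(mean_deviation, bandwidths):
    """Return the bandwidth with the smallest average deviation."""
    return min(bandwidths, key=mean_deviation)

def toy_mean_deviation(bandwidth):
    """Hypothetical stand-in: deviation is large for small and large bandwidths."""
    return (bandwidth - 2.5) ** 2

grid = bandwidth_grid()
best = select_bandwidth(toy_mean_deviation, grid)
print(len(grid), best)
```

Passing the deviation as a function keeps the search independent of how the PCFs are estimated, so the same loop can be run separately for each image.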
Figure 4
 
Optimal bandwidth for intensity estimation of PCF. The deviation from complete spatial randomness (cf. Equation 6) is evaluated for bandwidths between 0.1° and 10° in steps of 0.1° for each image.
Step 3: Compute PCF for each trial
In the last step we compute the PCF of our empirical data. For estimation, we use the intensity \(\hat \lambda (x)\) that resulted in the smallest deviation from complete spatial randomness of the PCF \(\hat g(r)\) for the inhomogeneous point process (see previous step). PCFs of individual scanpaths on an image are displayed in Figure 5 (gray lines). The three example scanpaths from Figure 3 are plotted in black. PCFs vary strongly between individual trials for all point processes. The average empirical PCF across all scanpaths on an image (red line) deviates from complete spatial randomness, \(\hat g(r) \ne 1\), for distances smaller than 4°. At distances beyond 4° the average PCF suggests independence of points, i.e., \(\hat g(r) \approx 1\). Thus, fixations co-occur in close proximity during individual trials. Conversely, areas further away are fixated as predicted by chance, i.e., by the overall inhomogeneity \(\hat \lambda (x)\) observed across all participants. Inspection of the control point processes demonstrates the absence of spatial correlations. The average PCFs of the inhomogeneous and homogeneous point processes are constant with \(\hat g(r) \approx 1\). The artifact of the estimation procedure at short distances is present in all estimates. 
Figure 5
 
PCFs of individual scanpaths (gray lines) and the average PCF on an image for the experimental data (red, left panel), inhomogeneous point process (green, central panel), and homogeneous point process (blue, right panel). PCFs of trials from Figure 3 are displayed in black.
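To make the estimation in Step 3 more concrete, the following sketch computes a kernel estimate of the inhomogeneous PCF for a single scanpath. It is an assumption-laden illustration, not the estimator of Equation 5 (which is not reproduced in this section): it uses an Epanechnikov kernel with a hypothetical half-width `h`, applies no edge correction, and expects the intensity as a user-supplied function `intensity`. All function and parameter names are hypothetical.

```python
import math


def epanechnikov(x, h):
    """Epanechnikov kernel with half-width h (integrates to one)."""
    if abs(x) > h:
        return 0.0
    return 0.75 / h * (1.0 - (x / h) ** 2)


def pcf_inhom(points, intensity, area, radii, h=0.5):
    """Kernel estimate of the inhomogeneous PCF, without edge correction.

    points    : list of (x, y) fixation locations of one scanpath (degrees)
    intensity : callable returning the estimated intensity at a location
    area      : area of the observation window (square degrees)
    radii     : distances r > 0 at which the PCF is evaluated
    """
    lam = [intensity(p) for p in points]
    g = []
    for r in radii:
        total = 0.0
        for i, p in enumerate(points):
            for j, q in enumerate(points):
                if i == j:
                    continue  # only pairs of distinct fixations contribute
                d = math.dist(p, q)
                total += epanechnikov(r - d, h) / (lam[i] * lam[j])
        g.append(total / (2.0 * math.pi * r * area))
    return g
```

Because edge correction is omitted, this sketch underestimates the PCF at distances that are large relative to the image size; a production analysis would rely on an established spatial statistics library instead.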
The same procedure can be repeated for each image. Figure 6 shows PCFs of each image averaged over participants (gray lines) as well as the average across all images (colored lines). While inhomogeneous and homogeneous point processes reveal no spatial correlations, empirical PCFs show spatial aggregation at short distances, r < 4° in all conditions. For a more detailed discussion see the Results section. 
Figure 6
 
PCFs of individual images. PCFs were estimated for each condition separately (natural vs. texture images; first vs. second inspection). Estimated PCFs of the empirical fixation locations (red lines), simulated locations generated by an inhomogeneous point process (green lines), and a homogeneous point process (blue lines).
Methods
For our experiment we had participants view two types of images twice. The repeated presentation of images can be understood as a form of visual long-term memory manipulation. From previous work it can be expected that this leads to similar fixation densities but shortens saccade amplitudes during the second inspection (Kaspar & König, 2011a, 2011b). The resulting point patterns are similar but differ slightly in the overall inhomogeneity, which makes a direct comparison of the eye movement behavior difficult. The same problem is true for the comparison of different image types. The PCF takes differences in the underlying inhomogeneity into account and allows a direct comparison of the spatial correlations under different viewing conditions (first vs. second presentation) and for different image types (natural vs. texture images). 
Participants
We recorded eye movements of 35 participants (15 male, 20 female) aged 17–36 years (M = 24.0). Participants received study credits or 8€ for participation and were recruited at the University of Potsdam and from a local school (32 students from the University of Potsdam, three pupils from a local high school). All participants had normal or corrected-to-normal vision as assessed by the Freiburg Vision Test (Bach, 1996).2 The study conformed to the national ethics guidelines. We obtained written informed consent from all participants. 
Apparatus
Stimuli were presented on a 20-in. CRT monitor (Mitsubishi Diamond Pro 2070; refresh rate 120 Hz; resolution: 1280 × 1024 pixels). Eye movements were recorded binocularly using the video-based Eyelink 1000 system (SR Research, Osgoode, ON, Canada) with a sampling rate of 1000 Hz. In order to reduce head movements we asked participants to position their head on a chinrest in front of the computer screen (viewing distance: 70 cm). Stimulus presentation and response collection were implemented in MATLAB (MathWorks, Natick, MA) using the Psychophysics Toolbox (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997) and Eyelink Toolbox (Cornelissen, Peters, & Palmer, 2002). 
Stimulus material
We used two sets of colored photographs in our experiment. In the first part of the experiment each set consisted of 15 images (Figure 7). Image Set 1 contained photographs of natural landscapes and rural scenes. Image Set 2 contained photographs of textures. During the memory test in the second part of the experiment we added 15 novel images of the same category to each image set. 
Figure 7
 
Images used in experiment. Left: natural scenes; right: texture images.
Task and procedure
Each trial began with the presentation of a black fixation cross in front of a uniform gray background. The fixation cross was placed randomly within the image boundaries. After successful fixation, the fixation cross was replaced by the image for 10 s. Participants were instructed to explore each image for a subsequent memory test. The first block of the experiment consisted of 60 trials where each image was presented twice. Results from the first inspection of images have been described previously (Engbert et al., 2015). The second inspection has not been published earlier. In a second block participants completed a memory test with 60 trials. The memory test contained all presented images and 30 new images with natural scenes and textures. Participants had to judge whether the image had been presented during Block 1. 
To minimize the potential influence of the monitor frame and since accuracy of eye trackers falls off towards the edges of a monitor, images were presented centrally with gray borders extending 32 pixels to the top/bottom and 40 pixels to the left/right of the image. The resulting size of the image was 1200 × 960 pixels (31.1° × 24.9°). 
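The image geometry stated above can be checked with a few lines of arithmetic. The sketch below derives the image size from the screen resolution and border widths and computes a pixel-to-degree factor from the reported image extent. The linear conversion is an assumption that ignores the slight nonlinearity of visual angle toward the screen edges, and all names are hypothetical.

```python
SCREEN_PX = (1280, 1024)   # monitor resolution (width, height)
BORDER_PX = (40, 32)       # gray border left/right and top/bottom
IMAGE_DEG = (31.1, 24.9)   # reported image extent in degrees of visual angle

# Image size after subtracting the border on both sides of each dimension
image_px = tuple(s - 2 * b for s, b in zip(SCREEN_PX, BORDER_PX))

# Approximate conversion factor (degrees per pixel) for each dimension
deg_per_px = tuple(d / p for d, p in zip(IMAGE_DEG, image_px))


def px_to_deg(x_px, y_px):
    """Convert image coordinates from pixels to degrees (linear approximation)."""
    return x_px * deg_per_px[0], y_px * deg_per_px[1]
```

Both dimensions give roughly 0.026° per pixel, so distances such as the 4° range discussed for the PCF correspond to about 154 pixels.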
Data preprocessing
We detected saccades by using a velocity-based algorithm (Engbert & Kliegl, 2003; Engbert & Mergenthaler, 2006). Saccades were defined as fast movements of both eyes that exceeded the average velocity during a trial by 6 SDs for at least 6 ms with a minimum amplitude of 0.5°. Eye traces between successive saccades were tagged as fixations. Fixation positions were computed by averaging the mean eye position of both eyes. Since trials started with a fixation check, first fixations on images were removed from the data set (N = 2,100). In addition, fixations containing a blink or with a blink during an adjacent saccade were excluded from subsequent analyses (N = 2,214). Overall, 55,526 fixations remained for further analyses. 
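A velocity-based detection of this kind can be sketched as follows. This is a simplified, monocular illustration in the spirit of Engbert and Kliegl (2003), not the exact implementation used for the data: the binocular criterion is omitted, velocities are smoothed over a five-sample window, and the threshold uses a median-based spread estimate. The function names and the assumption of a 1000 Hz trace in degrees are hypothetical choices for the example.

```python
import math
from statistics import median


def velocities(pos, dt):
    """Velocity from a five-sample moving window; the edges are left at zero."""
    v = [0.0] * len(pos)
    for n in range(2, len(pos) - 2):
        v[n] = (pos[n + 2] + pos[n + 1] - pos[n - 1] - pos[n - 2]) / (6.0 * dt)
    return v


def median_sd(v):
    """Median-based spread estimate that is robust against the saccades themselves."""
    var = median([a * a for a in v]) - median(v) ** 2
    return math.sqrt(max(var, 1e-12))  # guard against noise-free traces


def detect_saccades(x, y, dt=0.001, lam=6.0, min_samples=6, min_amp=0.5):
    """Return (start, end) sample indices of detected saccades.

    x, y        : gaze position in degrees, sampled every dt seconds
    lam         : velocity threshold in multiples of the spread estimate
    min_samples : minimum duration (6 samples = 6 ms at 1000 Hz)
    min_amp     : minimum amplitude in degrees
    """
    vx, vy = velocities(x, dt), velocities(y, dt)
    tx, ty = lam * median_sd(vx), lam * median_sd(vy)
    # Elliptic threshold in 2-D velocity space
    above = [(a / tx) ** 2 + (b / ty) ** 2 > 1.0 for a, b in zip(vx, vy)]
    saccades, start = [], None
    for n, flag in enumerate(above + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = n
        elif not flag and start is not None:
            end = n - 1
            amp = math.hypot(x[end] - x[start], y[end] - y[start])
            if end - start + 1 >= min_samples and amp >= min_amp:
                saccades.append((start, end))
            start = None
    return saccades
```

Samples between two detected saccades would then be tagged as a fixation, with its position given by the mean gaze position over those samples.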
SceneWalk model
For the interpretation of our results, we simulated fixation sequences with the SceneWalk model (Engbert et al., 2015; cf. Schütt et al., 2017). Fixation sequences are generated in the computational model by two competing activation maps: (a) an excitatory attention map that provides potential saccade targets and (b) an inhibitory fixation map that tags previously fixated locations. Maps in the model are discretized with dimensions k × l = 128 × 128 (Stensola et al., 2012). Activations in the attention map aij at coordinates (i, j) evolve over time. To approximate potential saccade targets we compute the empirical density of fixation locations on a given image. The empirical density contains all effects generated by bottom-up and top-down processing that contribute to the inhomogeneous distribution of fixation locations. In our model simulations, the empirical density feeds into the attention map. Extraction from the empirical density is highest at fixation and decreases with increasing eccentricity. This corresponds to the attentional window of our model. Mathematically, the empirical density of fixation locations is weighted by a Gaussian envelope of size σ1. Position of the attentional window aij changes after each saccade and remains constant otherwise. In addition leakage leads to a temporal decay of activations. The updating rule of the attention map aij is given by  
\begin{equation}\tag{7}{a_{ij}}(t + 1) = {{{\phi _{ij}}{A_{ij}}(t)} \over {\sum\limits_{kl} {{\phi _{kl}}} {A_{kl}}(t)}} + (1 - \rho ){a_{ij}}(t)\end{equation}
with a rate of decay ρ and a normalized point-wise product of a 2-D Gaussian Aij (t) centered upon fixation at time t and the distribution of fixation locations ϕij.  
Temporal evolution of activations in the inhibitory map fij is very similar to the dynamics in the attention map aij. Temporal evolution consists of activation accumulation centered at fixation and a proportional temporal decay across the map. The updating rule for the fixation map fij is given by  
\begin{equation}\tag{8}{f_{ij}}(t + 1) = {F_{ij}}(t) + (1 - \omega ){f_{ij}}(t)\end{equation}
with a 2-D Gaussian Fij (t) with standard deviation σ0 centered at fixation at time t and a decay rate of the fixation map ω. The fixation map tracks fixated areas and is motivated by inhibition of return (Klein & MacInnes, 1999), which has been suggested as a mechanism to drive exploration in scenes (Itti & Koch, 2001). Although the role of inhibition of return has been questioned (Smith & Henderson, 2009), model simulations support inhibitory tagging as an important mechanism during scene perception (Rothkegel et al., 2016; cf. Bays & Husain, 2012).  
While attention and fixation maps evolve independently over time, both maps are subsequently normalized and combined into a single map for target selection. The potential uij is given by  
\begin{equation}\tag{9}{u_{ij}}(t) = - {{{{[{a_{ij}}(t)]}^\lambda }} \over {\sum\limits_{kl} {{{[{a_{kl}}(t)]}^\lambda }} }} + {{{{[{f_{ij}}(t)]}^\gamma }} \over {\sum\limits_{kl} {{{[{f_{kl}}(t)]}^\gamma }} }}\end{equation}
where the exponents λ and γ are free parameters. Engbert et al. (2015) fixed these parameters to λ = 1 to reproduce the densities of gaze positions and γ = 0.3 to reproduce spatial correlations between fixation locations. We kept these values in our simulations. The probability of a location (i, j) to be chosen as the next saccade target can be extracted from the potential, i.e.,  
\begin{equation}\tag{10}{\pi _{ij}}(t) = \max \left( {{{{u_{ij}}(t)} \over {\sum\limits_{(k,l) \in S} {{u_{kl}}} (t)}},\eta } \right)\end{equation}
where S contains all positions on the grid with uij (t) ≤ 0 and a free parameter η that adds noise to the selection process so that every position has at least a minimal probability to be chosen as the next saccade target. Target selection in the SceneWalk model occurs at the end of fixation where the eyes move instantaneously. The intervals between successive saccades were drawn from a Gamma distribution with a shape parameter of 9 and a scale parameter of ∼0.031, which corresponds to a mean fixation duration of 275 ms.  
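One update of the model (Equations 7 to 10) and the sampling of fixation durations can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussians are left unnormalized, the selection weights of Equation 10 are renormalized implicitly when a target is drawn, a uniform fallback is used if the potential vanishes everywhere, and all function names and the parameter values used in any example call are hypothetical rather than the estimates of Table 1.

```python
import math
import random


def gaussian(shape, center, sigma):
    """Unnormalized 2-D Gaussian on a grid, centered on `center` = (row, col)."""
    ci, cj = center
    return [[math.exp(-((i - ci) ** 2 + (j - cj) ** 2) / (2.0 * sigma ** 2))
             for j in range(shape[1])] for i in range(shape[0])]


def normalise(grid):
    total = sum(map(sum, grid))
    return [[v / total for v in row] for row in grid]


def combine(g1, g2, fn):
    """Apply fn element-wise to two grids of equal size."""
    return [[fn(v1, v2) for v1, v2 in zip(r1, r2)] for r1, r2 in zip(g1, g2)]


def scenewalk_step(a, f, phi, fix, sigma1, sigma0, rho, omega,
                   lam=1.0, gamma=0.3, eta=1e-4):
    """Advance attention map a and fixation map f by one time step."""
    shape = (len(phi), len(phi[0]))
    A = gaussian(shape, fix, sigma1)   # attentional window
    F = gaussian(shape, fix, sigma0)   # inhibitory tagging
    # Equation 7: normalized input from the empirical density plus decay
    drive = normalise(combine(phi, A, lambda p, w: p * w))
    a_new = combine(drive, a, lambda d, old: d + (1.0 - rho) * old)
    # Equation 8: accumulation at fixation plus decay
    f_new = combine(F, f, lambda w, old: w + (1.0 - omega) * old)
    # Equation 9: potential from the normalized, exponentiated maps
    a_n = normalise([[v ** lam for v in row] for row in a_new])
    f_n = normalise([[v ** gamma for v in row] for row in f_new])
    u = combine(a_n, f_n, lambda av, fv: fv - av)
    # Equation 10: selection weights with noise floor eta
    s_total = sum(v for row in u for v in row if v <= 0)
    if s_total == 0:  # assumed fallback: flat potential, uniform selection
        n_cells = shape[0] * shape[1]
        return a_new, f_new, [[1.0 / n_cells] * shape[1] for _ in range(shape[0])]
    pi = [[max(v / s_total, eta) for v in row] for row in u]
    return a_new, f_new, pi


def choose_target(pi, rng=random):
    """Draw the next saccade target with probability proportional to pi."""
    cells = [(i, j) for i in range(len(pi)) for j in range(len(pi[0]))]
    weights = [pi[i][j] for i, j in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


def fixation_duration(rng=random, shape=9.0, scale=0.275 / 9.0):
    """Gamma-distributed interval (s); the mean is shape * scale = 0.275 s."""
    return rng.gammavariate(shape, scale)
```

The scale of 0.275 / 9 ≈ 0.031 reproduces the stated mean fixation duration, since the mean of a Gamma distribution is the product of its shape and scale parameters.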
Parameter estimation
Simulations were based on different parameters for each of the four experimental conditions (presentation × image type). As a starting point we used the parameters reported in Engbert et al. (2015). These parameters were estimated for fixations during the first presentation of natural images. We hypothesized that image type (natural scenes vs. textures) might affect target selection and decided to estimate all parameters for the first presentation of texture images anew. Previous work suggested that reinspection of images leads to a decreased attentional span (Kaspar & König, 2011a, 2011b). Hence, we decided to fix all parameters except for the sizes of the attention span σ1 and inhibition span σ0 for simulation of the second inspection of images. We used a genetic algorithm approach (Mitchell, 1998) to estimate model parameters. Parameter estimation was based on the first five images for each image type. The remaining 10 images in each image set were used for model evaluations. Limiting the analysis to the predicted images did not alter effects. We used first-order statistics (2-D density of fixation locations) and the distribution of saccade lengths as an objective function to evaluate parameters. A list of estimated parameter values and standard errors can be found in Table 1.
Table 1
 
Model parameters. Note: Standard errors were calculated from five parameter estimations of the experimental data recorded during the first presentation of natural images.
Control model: Joint probability of saccade amplitude and fixation density
To investigate the influence of saccade amplitudes on the aggregation of points, we simulated a control model that generates both realistic distributions of fixation positions and saccade amplitudes. Fixation sequences began at the initial fixation position of the empirically observed scanpath and subsequent fixation positions were simulated iteratively. The next fixation position was chosen proportional to the joint probability of the empirical density of saccade amplitudes and the empirical density of fixation positions on an image. For each empirical fixation sequence we simulated a scanpath with the same number of fixations to exclude effects due to differences in the number of fixations or sequences. The optimal bandwidth for density estimation was chosen by applying Scott's rule of thumb (bw.scott) for fixation densities and unbiased cross-validation (bw.ucv) for saccade amplitudes. 
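The iterative sampling of the control model can be sketched as follows. This is a hedged illustration: the joint probability is approximated by the product of the fixation density at a candidate location and the amplitude density at its distance from the current fixation, and the discretized density `density`, the amplitude density `amp_pdf`, and the function names are hypothetical stand-ins for the kernel density estimates described above.

```python
import math
import random


def next_fixation(current, density, amp_pdf, rng=random):
    """Sample the next fixation location.

    density : dict mapping candidate locations (x, y) to the empirical
              fixation density at that location
    amp_pdf : callable giving the empirical density of a saccade amplitude
    The weight of a candidate is the product of both densities; at least
    one candidate must have a positive weight.
    """
    locations = list(density)
    weights = [density[loc] * amp_pdf(math.dist(current, loc))
               for loc in locations]
    return rng.choices(locations, weights=weights, k=1)[0]


def simulate_scanpath(start, n_fix, density, amp_pdf, rng=random):
    """Simulate a scanpath with n_fix fixations from an empirical start point."""
    path = [start]
    while len(path) < n_fix:
        path.append(next_fixation(path[-1], density, amp_pdf, rng))
    return path
```

Calling `simulate_scanpath` once per empirical trial, with the trial's initial fixation and number of fixations, matches the simulated and observed data sets in size.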
Statistical modeling
For statistical analyses, we computed linear mixed-effect models for each dependent variable using the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) in R (R Core Team, 2018). We log-transformed both dependent variables, since they deviated considerably from normal distributions. For the statistical model of the empirical data, we used the maximal possible random effect structure (Barr, Levy, Scheepers, & Tily, 2013) and ensured that none of the models was degenerate (Bates, Kliegl, Vasishth, & Baayen, 2015). For our results we interpret all |t| > 2 as significant fixed effects (Baayen, Davidson, & Bates, 2008). 
In R the linear mixed effect model for the empirical data can be written as  
\begin{equation}\tag{11}\log({\rm{dv}})\sim \underbrace {\rm{1 + A + B + A\!:\!B}}_{{\rm{fixed\ effects}}}\underbrace {\, + \,({\rm{1 + A + B + A\!:\!B\,|\,id}})}_{{\rm{random\ effects\ id}}}\underbrace { + \,({\rm{1 + A\,|\,img}})}_{{\rm{random\ effects\ image}}}\end{equation}
with fixed effects of presentation A, image type B, and their interaction A:B and the corresponding random effect structure. As each image belongs to only one image type, image type is a between factor for images and was not included in the random effect structure for images.  
The random effect structure for the simulated data differed from Equation 11, since fixation sequences were only based on one participant, that is, the average participant. Therefore, we did not estimate random effects for participants. Due to convergence problems we also removed the random slope of presentation for each image. This reduced the random effect structure to a single intercept per image for the simulated data. 
Results
With our first analysis we expected to replicate the results of Kaspar and König (2011a, 2011b) that a second inspection of the same image leads to decreased saccade amplitudes. In addition, we investigated whether the two image types (natural scenes vs. textures) differentially affected saccade amplitudes. In a second analysis, we tested the sensitivity of the PCF to our experimental manipulations. All experimental results were compared to model simulations of the SceneWalk model (Engbert et al., 2015) and the joint probability control model that reproduces fixation densities and saccade amplitudes. In addition, we display the results of the inhomogeneous and the homogeneous point processes. These processes are plotted to demonstrate that results of the PCF are not generated by the method itself. 
Saccade amplitudes
Figure 8 shows the distribution of saccade amplitudes on natural scenes (top) and texture images (bottom) during the first and second presentation (solid vs. dashed lines). As expected, empirical saccade amplitude distributions are positively skewed and long-tailed. Distributions of saccade amplitudes of the SceneWalk model and the joint probability control model deviate less from the empirically observed saccade amplitudes than the inhomogeneous and homogeneous point processes. On closer inspection, the SceneWalk model generates slightly shorter amplitudes while the joint probability control model generates slightly longer saccade amplitudes than our participants. 
Figure 8
 
Saccade amplitudes. Experiment (red), and results of model simulations by the SceneWalk model (purple), a joint probability model of saccade amplitudes and fixation density (brown), an inhomogeneous point process (green), and a homogeneous point process (blue). Note, scales of the x-axis differ between the first three and the last two point processes.
A linear mixed effects model (LME) for the experimental data (see Methods section; Bates et al., 2015) revealed a significant fixed effect of presentation but no effect of image type and no interaction (Table 2, left columns). Saccades were larger during the first inspection than during the second inspection. LME models of saccade amplitudes generated by the SceneWalk model and the joint probability control model replicated this effect qualitatively. However, the interaction of presentation and image type also reached significance for the SceneWalk model. The reduction of saccade amplitudes during the second inspection was stronger on natural scenes than on texture images. 
Table 2
 
Fixed effects of linear mixed effect models. For each point process we estimated separate models for saccade amplitudes (log-transformed) and the summed deviation from complete spatial randomness of the PCF (log-transformed). The table reports estimates of fixed effects (β) with standard errors (SE) and t values. |t| > 2 are interpreted as significant effects.
Second order statistics: PCF
We computed inhomogeneous PCFs for each condition of the five point processes (Figure 9). We observed spatial correlations of fixation locations during individual trials in all conditions of our experimental data (red lines). Fixation locations were more abundant than expected from the overall inhomogeneity of fixation locations at short distances r. The estimated PCFs deviated from complete spatial randomness, i.e., \(\hat g(r) \gt 1\), at distances r < 4°. More importantly, the second presentation of an image (dashed lines) led to increased PCFs for both natural scenes (top row) and texture images (bottom row). Statistically, we evaluated spatial correlations by computing the deviation from complete spatial randomness of each PCF, i.e., the summed deviation of the PCF from one for distances 0.1° ≤ r ≤ 6.5° (Figure 10; cf. Equation 6). An LME revealed a significant deviation from complete spatial randomness (intercept) and an effect of presentation (Table 2). All other fixed effects were nonsignificant. Thus, deviations from complete spatial randomness were present in all conditions of our experiment with larger deviations during the second inspection of an image irrespective of image type. 
Figure 9
 
Average PCFs for experimental and simulated point processes. PCFs were estimated for each condition separately (natural vs. texture images: top vs. bottom row; first vs. second inspection: solid vs. dashed lines). Estimated PCFs for the observed fixation locations (red lines), and simulated locations generated by the SceneWalk model (purple lines), a joint probability model of saccade amplitudes and fixation density (brown lines), an inhomogeneous point process (green lines), and a homogeneous point process (blue lines). The 95% confidence intervals represent the variability across images.
Figure 10
 
Summed deviation from complete spatial randomness (Equation 6) of empirical and simulated point processes for all conditions (natural vs. texture images; first vs. second inspection). The deviation was computed for distances \(0.1{\rm{^\circ }} \lt r \lt 6.5{\rm{^\circ }}\). As expected, the deviation from complete spatial randomness is smallest for the two control processes (homogeneous and inhomogeneous point process).
The SceneWalk model replicated this pattern of results qualitatively (Figure 9, purple lines). All conditions showed strong spatial correlations. PCFs deviated from complete spatial randomness for distances r < 6°. The effect extended to larger distances in our model simulations than in the experimental data. We analyzed deviations from complete spatial randomness with another LME for the SceneWalk model (Table 2; cf. Figure 10). PCFs deviated from complete spatial randomness (intercept). The effect was larger for the second inspection. No other fixed effect was significant for the deviation score of the SceneWalk model. 
In order to check whether spatial correlations between fixation locations are primarily generated by the tendency of participants to generate short saccade amplitudes, we simulated a control model based on the joint probability of saccade amplitudes and the distribution of fixations (see Methods). The PCF of the control model showed correlations at all evaluated distances. In direct comparison to the experimental data the control model generated considerably weaker correlations at small distances r < 3° and generated stronger correlations at large distances r > 4°. Thus, a model that is solely based on the generation of realistic saccade amplitudes and fixation densities generates a qualitatively different correlation pattern. We analyzed deviations from complete spatial randomness with another LME for the control model (Table 2; cf. Figure 10). PCFs deviated from complete spatial randomness (intercept). The effect was larger for the second inspection and on texture images. We observed no interaction of presentation and image type in the deviation score. 
Finally, the average PCFs of the inhomogeneous point process indicated only small spatial correlations, i.e., \(\hat g(r) \approx 1\) at all distances (Figure 9, green lines). The absence of spatial correlations for the average PCF was expected, since the optimal bandwidth λ for the estimation of the PCF was chosen to minimize the deviation from complete spatial randomness of the inhomogeneous process (see Step 2 of the estimation process). PCFs of the homogeneous point process were similar to PCFs of the inhomogeneous point process, i.e., \(\hat g(r) \approx 1\) at all distances (Figure 9, blue lines). 
Discussion
During scene perception, fixations are not uniformly distributed on an image. Instead fixations cluster in parts of an image due to bottom-up factors, top-down factors, and systematic tendencies of gaze control (Tatler & Vincent, 2008). We propose to use the PCF to investigate the relation of fixation positions within single trials and demonstrate that the PCF reveals aggregation of points at distances r < 4°—that is, it is more likely to observe fixation locations in the proximity of another fixation location than expected by the overall distribution of fixation locations. This effect cannot be explained by the tendency of participants to generate short saccade amplitudes alone, as simulations of a control model that samples fixation locations from the joint probability of the density of saccade amplitudes and the density of fixation locations led to qualitatively different PCFs. The control model underestimated aggregation at short distances and overestimated aggregation at large distances. In addition, the PCF responded sensitively to a memory manipulation in our experiment and revealed stronger aggregation of points during the second inspection than during the first inspection of an image. Simulations of the SceneWalk model (Engbert et al., 2015) demonstrated that a reduced attentional span could lead to reduced saccade amplitudes and explain the observed results of the PCF during the second inspection. 
Pair correlation function
Research on eye movements during scene perception has focused on the distribution of fixation locations across observers, as for example, during the evaluation of saliency models (Bylinskii et al., 2015). However, this approach neglects dependencies between fixation locations during a trial (Engbert et al., 2015). The PCF is a method from spatial statistics to evaluate the relation of pairs of points (Diggle, 2013) and reveals whether points solely depend on the inhomogeneity of a point process or whether pairs of points affect each other mutually at a given distance r. The PCF can be applied to eye movement data (i.e., fixation locations) in three steps. In a first step, inhomogeneous and homogeneous point processes need to be simulated to evaluate the PCF estimation. Both point processes generate fixation locations that are independent at all distances r. Hence, PCFs of both processes are expected to show no spatial correlations at any distance r. During a second step an optimal bandwidth needs to be chosen for the estimation of the PCF. As a criterion we suggest using a bandwidth for which the PCF of the simulated inhomogeneous point process has the least deviation from complete spatial randomness, i.e., shows no spatial correlations. In a last step, the computed bandwidth is used to compute PCFs for each individual trial. 
In scene perception, the PCF characterizes whether fixation locations can be explained by the underlying distribution of all fixation locations (no spatial correlations between pairs of points) or whether fixation positions interact with each other at distance r. PCFs revealed spatial correlations of fixation locations in all conditions in our experimental data. Fixations were more abundant at distances r < 4° than we would expect from the inhomogeneity of fixation locations alone. Beyond 4° fixation locations were independent of each other. Thus, observing a fixation increased the probability of observing more fixations than expected by the overall inhomogeneity within 4°. Beyond 4° fixations were as likely as predicted by the local intensity of fixation locations. As expected, neither the inhomogeneous nor the homogeneous point process revealed strong spatial correlations, since fixation locations were independent of each other for these point processes. Therefore, aggregation observed in our PCFs is generated by the empirical data and is not the result of the method itself. 
Finally, the PCF provides a quantitative statistic of spatial correlations and can be used to compare data sets generated by different processes or under different conditions. The data sets may even differ in the overall inhomogeneity of the fixation location distribution, since the inhomogeneous PCF takes this inhomogeneity into account. Hence, we were able to compare point patterns of the first and second inspection of an image, point patterns on different image types, as well as empirical and simulated point patterns. Even though the distribution of fixation locations and the distribution of saccade amplitudes were similar among these data sets, the PCF revealed considerable differences in the correlation patterns. For example, the experimental data, simulations of the SceneWalk model, and simulations of the joint probability control model showed strong differences in spatial correlations. While the SceneWalk model generated stronger but qualitatively similar spatial correlations when compared with the experimental data, the joint probability control model produced a different pattern. Fixation locations aggregated at all evaluated distances, but produced much weaker correlations for short distances. Thus, short saccade amplitudes are not sufficient to produce the strong correlations at distances r < 2°. There must be additional mechanisms that let participants fixate the same locations in a scanpath, such as direct regressions (cf. facilitation of return; Smith & Henderson, 2009) and reinspections later during a trial. 
Interestingly, image type did not affect spatial correlations. However, we only tested two types of images, and a broader range of image types is needed to see the generalizability of this result. To conclude, the PCF is a powerful tool to compare spatial correlations of point patterns and will help to understand the eye-movement dynamics under different experimental conditions as well as the dynamics of different models of eye-movement control during scene perception. 
Repeated presentation of images
To test the sensitivity of the PCF, we recorded eye movements of participants while viewing an image twice. The repeated presentation was expected to result in shorter saccade amplitudes due to a reduced attentional span (Kaspar & König, 2011a, 2011b). We replicated shorter saccade amplitudes during the second inspection independent of image type in the experimental data. In addition, we observed stronger spatial correlations within 4° during the second inspection. 
Reduced saccade amplitudes as well as increased aggregation might be generated by a reduced attentional window (Kaspar & König, 2011a, 2011b). We tested this hypothesis with simulations of the SceneWalk model (Engbert et al., 2015). Parameter estimation led to a reduced attentional span in the SceneWalk model during the second inspection of images which in turn led to shorter saccade amplitudes and stronger aggregation of fixation locations as quantified by the PCFs. Hence, our simulation results are in agreement with the interpretation of a reduced attentional span during repeated inspection of images. 
In its current form the SceneWalk model overestimates aggregation, in particular during the second inspection. However, our results are predictions and are not optimized to account for the experimentally observed PCFs. Besides the size of the aggregation, we observed a difference in the functional form of the PCF. PCFs of the SceneWalk model did not decrease as fast as those estimated from experimental data. This might have resulted from the Gaussian form of our attentional window. A revision of the model using a modified attentional window might improve model fit in this respect (cf. Schütt et al., 2017). 
Relation to other metrics
Several measures can be used to study the dynamics of scanpaths. The measure that is most strongly related to the PCF is based on Voronoi diagrams (Over, Hooge, & Erkelens, 2006). The suggested method provides a measure for the uniformity of a pattern of fixation locations and is normalized by the number of fixation locations. Hence, the measure can be used to compare the uniformity of different data sets. While the Voronoi method allows for estimation of the inhomogeneity of a point pattern, the inhomogeneous PCF takes this inhomogeneity into account and reveals spatial correlations between points that cannot be explained by the inhomogeneity. Most importantly, the inhomogeneous PCF can be used to compare point patterns with different levels of inhomogeneity. Thus, the Voronoi method and the PCF complement each other. 
A number of scanpath comparison methods have been proposed that provide metrics to describe the similarity of the dynamics of two scanpaths (for a review see Anderson, Anderson, Kingstone, & Bischof, 2015). Each metric quantifies unique aspects of the scanpaths and helps to understand which aspects resemble each other or differ between two scanpaths. However, all of these methods focus on the comparison of scanpaths and therefore require matching pairs of scanpaths. In contrast, the PCF does not require pairs of scanpaths, as each scanpath is evaluated on its own. This makes it possible to compare arbitrary scanpaths. In addition, the PCF and scanpath comparison metrics might come to very different conclusions. Since the PCF describes spatial correlations of points, two completely different point patterns might produce the same correlations. At the same time, two seemingly similar point patterns might result in very different spatial correlations. 
Other approaches have described a large variety of individual effects that influence eye movement behavior during scene perception (e.g., Smith & Henderson, 2009; Tatler, 2007; Tatler & Vincent, 2008), and best practices have been suggested to estimate appropriate baselines for some of them, for example, for the central fixation bias (Clarke, Stainer, Tatler, & Hunt, 2017; Clarke & Tatler, 2014). Each individual effect describes parts of the dynamics present in eye movements, but none is a sufficient metric to capture the overall complexity. Since the PCF takes all fixations of a sequence into account, it uncovers consequences of these effects that remain unnoticed on shorter time scales. Hence, the PCF acts in concert with these other methods to understand eye movements during scene perception. 
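The point that different patterns can share the same spatial correlations can be illustrated with a small Python sketch. It assumes a PCF estimate that depends only on interpoint distances (as in the homogeneous case, or when the intensity is transformed along with the pattern); the coordinates and function names are invented for illustration.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Sorted list of all interpoint distances of a pattern."""
    return sorted(math.hypot(p[0] - q[0], p[1] - q[1])
                  for p, q in combinations(points, 2))

def rotate(points, angle, center):
    """Rotate a pattern by `angle` (radians) around `center`."""
    cx, cy = center
    c, s = math.cos(angle), math.sin(angle)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]

# Hypothetical fixation locations (in degrees of visual angle).
original = [(2.0, 3.0), (2.5, 3.5), (8.0, 1.0), (8.4, 1.2), (5.0, 6.0)]
rotated = rotate(original, math.pi / 2, center=(5.0, 5.0))

# The set of interpoint distances is unchanged by the rotation ...
same_distances = all(
    abs(a - b) < 1e-9
    for a, b in zip(pairwise_distances(original), pairwise_distances(rotated))
)

# ... although corresponding fixations are far apart, so a point-by-point
# scanpath comparison would treat the two patterns as dissimilar.
mean_shift = sum(
    math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(original, rotated)
) / len(original)
```

Here `same_distances` is true while `mean_shift` is several degrees, which is the sense in which a distance-based summary and a scanpath comparison metric can disagree.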
Conclusions
The PCF is a powerful tool to analyze spatial correlations of fixation locations. During scene perception, the PCF reveals aggregation of fixations during individual trials and reacts differentially to experimental manipulations. Simulations of a computational model demonstrate that a reduced attentional span leads to increased aggregation of fixation locations. Our work provides an example of how spatial statistics and computational modeling can be combined to investigate general statistical properties of eye movement control. 
Acknowledgments
This work was funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, grant EN 471/13–1 to R.E.) and the Collaborative Research Centre SFB 1294, project number B05, and by Bundesministerium für Bildung und Forschung (BMBF) via Bernstein Center for Computational Neuroscience Berlin (Project B3, Förderkennzeichen 01GQ1001F to R.E. & F.A.W.). Simon Barthelmé was supported by an ANR grant (GenGP, ANR-16-CE23-0008). 
Commercial relationships: none. 
Corresponding author: Hans A. Trukenbrod. 
Address: Department of Psychology, University of Potsdam, Potsdam, Germany. 
References
Anderson, N. C., Anderson, F., Kingstone, A., & Bischof, W. F. (2015). A comparison of scanpath comparison methods. Behavior Research Methods, 47 (4), 1377–1392, https://doi.org/10.3758/s13428-014-0550-3.
Baayen, R., Davidson, D., & Bates, D. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59 (4), 390–412, https://doi.org/10.1016/j.jml.2007.12.005.
Bach, M. (1996). The Freiburg visual acuity test-automatic measurement of visual acuity. Optometry & Vision Science, 73 (1), 49–53.
Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. Boca Raton, FL: CRC Press. https://doi.org/10.18637/jss.v075.b02
Baddeley, A., & Turner, R. (2005). spatstat: An R package for analyzing spatial point patterns. Journal of Statistical Software, 12 (6), 1–42.
Ballard, D. H., Hayhoe, M. M., & Rao, R. P. N. (1997). Deictic codes for the embodiment of cognition. Behavioral & Brain Sciences, 20 (4), 723–767.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68 (3), 255–278, https://doi.org/10.1016/j.jml.2012.11.001.
Barthelmé, S., Trukenbrod, H. A., Engbert, R., & Wichmann, F. A. (2013). Modeling fixation locations using spatial point processes. Journal of Vision, 13 (12): 1, 1–34, https://doi.org/10.1167/13.12.1. [PubMed] [Article]
Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious mixed models. Retrieved from http://arxiv.org/abs/1506.04967v2.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67 (1), 1–48, https://doi.org/10.18637/jss.v067.i01.
Bays, P. M., & Husain, M. (2012). Active inhibition and memory promote exploration and search of natural scenes. Journal of Vision, 12 (8): 8, 1–8, https://doi.org/10.1167/12.8.8. [PubMed] [Article]
Brady, T. F., Konkle, T., Alvarez, G. A., & Oliva, A. (2008). Visual long-term memory has a massive storage capacity for object details. Proceedings of the National Academy of Sciences, 105 (38), 14325–14329.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 443–446.
Bruce, N. D. B., & Tsotsos, J. K. (2009). Saliency, attention, and visual search: An information theoretic approach. Journal of Vision, 9 (3): 5, 1–24, https://doi.org/10.1167/9.3.5. [PubMed] [Article]
Bylinskii, Z., Judd, T., Borji, A., Itti, L., Durand, F., Oliva, A., & Torralba, A. (2015). MIT saliency benchmark. Retrieved from http://saliency.mit.edu/
Castelhano, M. S., Mack, M. L., & Henderson, J. M. (2009). Viewing task influences eye movement control during active scene perception. Journal of Vision, 9 (3): 6, 1–15, https://doi.org/10.1167/9.3.6. [PubMed] [Article]
Cerf, M., Harel, J., Einhäuser, W., & Koch, C. (2007). Predicting human gaze using low-level saliency combined with face detection. Advances in Neural Information Processing Systems, 20, 241–248.
Clarke, A. D. F., Stainer, M. J., Tatler, B. W., & Hunt, A. R. (2017). The saccadic flow baseline: Accounting for image-independent biases in fixation behavior. Journal of Vision, 17 (11): 12, 1–19, https://doi.org/10.1167/17.11.12. [PubMed] [Article]
Clarke, A. D. F., & Tatler, B. W. (2014). Deriving an appropriate baseline for describing fixation behaviour. Vision Research, 102, 41–51, https://doi.org/10.1016/j.visres.2014.06.016.
Cornelissen, F. W., Peters, E., & Palmer, J. (2002). The Eyelink Toolbox: Eye tracking with MATLAB and the Psychophysics Toolbox. Behavior Research Methods, Instruments & Computers, 34, 613–617.
Diggle, P. J. (2013). Statistical analysis of spatial and spatio-temporal point patterns. Boca Raton, FL: CRC Press.
Einhäuser, W., Spain, M., & Perona, P. (2008). Objects predict fixations better than early saliency. Journal of Vision, 8 (14): 18, 1–26, https://doi.org/10.1167/8.14.18. [PubMed] [Article]
Engbert, R., & Kliegl, R. (2003). Microsaccades uncover the orientation of covert attention. Vision Research, 43, 1035–1045.
Engbert, R., & Mergenthaler, K. (2006). Microsaccades are triggered by low level retinal image slip. Proceedings of the National Academy of Sciences, USA, 103, 7192–7197.
Engbert, R., Trukenbrod, H. A., Barthelmé, S., & Wichmann, F. A. (2015). Spatial statistics and attentional dynamics in scene viewing. Journal of Vision, 15 (1): 14, 1–17, https://doi.org/10.1167/15.1.14. [PubMed] [Article]
Foulsham, T., & Kingstone, A. (2010). Asymmetries in the direction of saccades during perception of scenes and fractals: Effects of image type and image features. Vision Research, 50 (8), 779–795.
Foulsham, T., Kingstone, A., & Underwood, G. (2008). Turning the world around: Patterns in saccade direction vary with picture orientation. Vision Research, 48 (17), 1777–1790.
Hayhoe, M., & Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9 (4), 188–194.
Hollingworth, A., & Henderson, J. M. (2002). Accurate visual memory for previously attended objects in natural scenes. Journal of Experimental Psychology: Human Perception & Performance, 28 (1), 113–136.
Hollingworth, A., Williams, C. C., & Henderson, J. M. (2001). To see and remember: Visually specific information is retained in memory from previously attended objects in natural scenes. Psychonomic Bulletin & Review, 8 (4), 761–768.
Hooge, I. T. C., Over, E. A. B., van Wezel, R. J. A., & Frens, M. A. (2005). Inhibition of return is not a foraging facilitator in saccadic search and free viewing. Vision Research, 45 (14), 1901–1908.
Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2 (3), 194–203.
Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2009). Learning to predict where humans look. In IEEE 12th International Conference on Computer Vision, 2009 (pp. 2106–2113). Kyoto, Japan: IEEE. https://doi.org/10.1109/ICCV.2009.5459462
Kaspar, K., & König, P. (2011a). Overt attention and context factors: The impact of repeated presentations, image type, and individual motivation. PLoS One, 6 (7), e21719.
Kaspar, K., & König, P. (2011b). Viewing behavior and the impact of low-level image properties across repeated presentations of complex scenes. Journal of Vision, 11 (13): 26, 1–29, https://doi.org/10.1167/11.13.26. [PubMed] [Article]
Kienzle, W., Franz, M. O., Schölkopf, B., & Wichmann, F. A. (2009). Center-surround patterns emerge as optimal. Journal of Vision, 9 (5): 7, 1–15, https://doi.org/10.1167/9.5.7. [PubMed] [Article]
Kleiner, M., Brainard, D., Pelli, D., Ingling, A., Murray, R., & Broussard, C. (2007). What's new in Psychtoolbox-3. Perception, 36 (ECVP Abstract Supplement).
Klein, R. M., & MacInnes, W. J. (1999). Inhibition of return is a foraging facilitator in visual search. Psychological Science, 10 (4), 346–352.
Koch, C., & Ullman, S. (1985). Shifts in visual attention: Towards the underlying circuitry. Human Neurobiology, 4, 219–222.
Konkle, T., Brady, T. F., Alvarez, G. A., & Oliva, A. (2010). Scene memory is more detailed than you think: The role of categories in visual long-term memory. Psychological Science, 21 (11), 1551–1556.
Kümmerer, M., Wallis, T. S. A., & Bethge, M. (2016). Deepgaze II: Reading fixations from deep features trained on object recognition. CoRR, abs/1610.01563. Retrieved from http://arxiv.org/abs/1610.01563
Land, M. F., & Hayhoe, M. (2001). In what ways do eye movements contribute to everyday activities? Vision Research, 41 (25–26), 3559–3565.
Law, R., Illian, J., Burslem, D. F. R. P., Gratzer, G., Gunatilleke, C. V. S., & Gunatilleke, I. A. U. N. (2009). Ecological information from spatial patterns of plants: Insights from point process theory. Journal of Ecology, 97 (4), 616–628.
Le Meur, O., & Liu, Z. (2015). Saccadic model of eye movements for free-viewing condition. Vision Research, 116, 152–164.
Mannan, S. K., Ruddock, K. H., & Wooding, D. S. (1997). Fixation patterns made during brief examination of two-dimensional images. Perception, 26 (8), 1059–1072.
Melcher, D. (2001, July 26). Persistence of visual memory for scenes. Nature, 412 (6845), 401.
Melcher, D., & Kowler, E. (2001). Visual scene memory and the guidance of saccadic eye movements. Vision Research, 41 (25–26), 3597–3611.
Mitchell, M. (1998). An introduction to genetic algorithms. Cambridge, MA: MIT Press.
Nuthmann, A., & Henderson, J. M. (2010). Object-based attentional selection in scene viewing. Journal of Vision, 10 (8): 20, 1–19, https://doi.org/10.1167/10.8.20. [PubMed] [Article]
Over, E. A. B., Hooge, I. T. C., & Erkelens, C. J. (2006). A quantitative measure for the uniformity of fixation density: The Voronoi method. Behavior Research Methods, 38 (2), 251–261. https://doi.org/10.3758/BF03192777
Parkhurst, D., Law, K., & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42 (1), 107–123.
Pelli, D. G. (1997). The videotoolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Pelz, J. B., & Canosa, R. (2001). Oculomotor behavior and perceptual strategies in complex tasks. Vision Research, 41 (25–26), 3587–3596.
R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Reinagel, P., & Zador, A. M. (1999). Natural scene statistics at the centre of gaze. Network: Computation in Neural Systems, 10, 341–350.
Rothkegel, L. O., Trukenbrod, H. A., Schütt, H. H., Wichmann, F. A., & Engbert, R. (2016). Influence of initial fixation position in scene viewing. Vision Research, 129, 33–49. https://doi.org/10.1016/j.visres.2016.09.012
Rothkegel, L. O., Trukenbrod, H. A., Schütt, H. H., Wichmann, F. A., & Engbert, R. (2017). Temporal evolution of the central fixation bias in scene viewing. Journal of Vision, 17 (13): 3, 1–18, https://doi.org/10.1167/17.13.3. [PubMed] [Article]
Schütt, H. H., Rothkegel, L., Trukenbrod, H. A., Engbert, R., & Wichmann, F. A. (2019). Disentangling bottom-up versus top-down and low-level versus high-level influences on eye movements over time. Journal of Vision, 19 (3): 1, 1–23, https://doi.org/10.1167/19.3.1. [PubMed] [Article]
Schütt, H. H., Rothkegel, L. O., Trukenbrod, H. A., Reich, S., Wichmann, F. A., & Engbert, R. (2017). Likelihood-based parameter estimation and comparison of dynamical cognitive models. Psychological Review, 124 (4), 505–524.
Smith, T. J., & Henderson, J. M. (2009). Facilitation of return during scene viewing. Visual Cognition, 17 (6–7), 1083–1108.
Standing, L., Conezio, J., & Haber, R. N. (1970). Perception and memory for pictures: Single-trial learning of 2500 visual stimuli. Psychonomic Science, 19 (2), 73–74.
Stensola, H., Stensola, T., Solstad, T., Frøland, K., Moser, M.-B., & Moser, E. I. (2012, December 6). The entorhinal grid map is discretized. Nature, 492 (7427), 72–78.
Tatler, B. W. (2007). The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions. Journal of Vision, 7 (14): 4, 1–17, https://doi.org/10.1167/7.14.4. [PubMed] [Article]
Tatler, B. W., Baddeley, R. J., & Gilchrist, I. D. (2005). Visual correlates of fixation selection: Effects of scale and time. Vision Research, 45 (5), 643–659.
Tatler, B. W., & Vincent, B. T. (2008). Systematic tendencies in scene viewing. Journal of Eye Movement Research, 2 (2), 1–18.
Torralba, A., Oliva, A., Castelhano, M. S., & Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: The role of global features in object search. Psychological Review, 113 (4), 766–786.
Vig, E., Dorr, M., & Cox, D. (2014). Large-scale optimization of hierarchical features for saliency prediction in natural images. 2014 IEEE Conference on Computer Vision and Pattern Recognition (pp. 2798–2805). Columbus, OH: IEEE. https://doi.org/10.1109/CVPR.2014.358
Wickham, H. (2009). ggplot2: Elegant graphics for data analysis. New York, NY: Springer. Retrieved from http://had.co.nz/ggplot2/book
Wilming, N., Harst, S., Schmidt, N., & König, P. (2013). Saccadic momentum and facilitation of return saccades contribute to an optimal foraging strategy. PLoS Computational Biology, 9 (1), e1002871.
Yarbus, A. L. (1967). Eye movements and vision. New York, NY: Plenum Press.
Zhang, L., Tong, M. H., Marks, T. K., Shan, H., & Cottrell, G. W. (2008). SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision, 8 (7): 32, 1–20, https://doi.org/10.1167/8.7.32. [PubMed] [Article]
Footnotes
1  The Epanechnikov kernel \(\epsilon(x) = \frac{3}{4w}\left(1 - \frac{x^2}{w^2}\right)_+\) with \((x)_+ = \max(0, x)\) is a quadratic function that is truncated to the interval [−w, w].
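As a brief check on the kernel in Footnote 1, and assuming \(w > 0\), the following worked integral indicates that the kernel integrates to one over its support, so that it acts as a normalized weighting function:

```latex
\int_{-w}^{w} \frac{3}{4w}\left(1 - \frac{x^2}{w^2}\right) dx
  = \frac{3}{4w}\left[x - \frac{x^3}{3w^2}\right]_{-w}^{w}
  = \frac{3}{4w}\left(2w - \frac{2w}{3}\right)
  = \frac{3}{4w}\cdot\frac{4w}{3}
  = 1
```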
Figure 1
 
Examples of point patterns (upper panels): regular, random, and aggregated (clustered). Corresponding PCFs (lower panels).
Figure 2
 
Distribution of fixation locations. Dots represent fixation locations; estimated intensities of each point pattern are illustrated by gray shading. Experimental fixation locations (red, left panel), locations generated by an inhomogeneous point process (green, central panel), and a homogeneous point process (blue, right panel).
Figure 3
 
Three representative scanpaths during viewing of an image. Each row represents the scanpath during a single trial of the empirical data (red, left panel), the simulated inhomogeneous point process (green, central panel), and the simulated homogeneous point process (blue, right panel).
Figure 4
 
Optimal bandwidth for intensity estimation of PCF. The deviation from complete spatial randomness (cf. Equation 6) is evaluated for bandwidths between 0.1° and 10° in steps of 0.1° for each image.
Figure 5
 
PCFs of individual scanpaths (gray lines) and the average PCF on an image for the experimental data (red, left panel), inhomogeneous point process (green, central panel), and homogeneous point process (blue, right panel). PCFs of trials from Figure 3 are displayed in black.
Figure 6
 
PCFs of individual images. PCFs were estimated for each condition separately (natural vs. texture images; first vs. second inspection). Estimated PCFs of the empirical fixation locations (red lines), simulated locations generated by an inhomogeneous point process (green lines), and a homogeneous point process (blue lines).
Figure 7
 
Images used in experiment. Left: natural scenes; right: texture images.
Figure 8
 
Saccade amplitudes. Experiment (red), and results of model simulations by the SceneWalk model (purple), a joint probability model of saccade amplitudes and fixation density (brown), an inhomogeneous point process (green), and a homogeneous point process (blue). Note that the scales of the x-axis differ between the first three and the last two point processes.
Figure 9
 
Average PCFs for experimental and simulated point processes. PCFs were estimated for each condition separately (natural vs. texture images: top vs. bottom row; first vs. second inspection: solid vs. dashed lines). Estimated PCFs for the observed fixation locations (red lines), and simulated locations generated by the SceneWalk model (purple lines), a joint probability model of saccade amplitudes and fixation density (brown lines), an inhomogeneous point process (green lines), and a homogeneous point process (blue lines). The 95% confidence intervals represent the variability across images.
Figure 10
 
Summed deviation from complete spatial randomness (Equation 6) of empirical and simulated point processes for all conditions (natural vs. texture images; first vs. second inspection). The deviation was computed for distances \(0.1^\circ < r < 6.5^\circ\). As expected, the deviation from complete spatial randomness is smallest for the two control processes (homogeneous and inhomogeneous point process).
Table 1
 
Model parameters. Note: Standard errors were calculated from five parameter estimations of the experimental data recorded during the first presentation of natural images.
Table 2
 
Fixed effects of linear mixed effect models. For each point process we estimated separate models for saccade amplitudes (log-transformed) and the summed deviation from complete spatial randomness of the PCF (log-transformed). The table reports estimates of fixed effects (β) with standard errors (SE) and t values. |t| > 2 are interpreted as significant effects.