**Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function, a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within 4° is stronger than expected from chance. Furthermore, the pair correlation function reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.**

A first-order statistic of a spatial point process is its intensity *λ*(*x*). In the case of eye movements, the intensity represents the local average spatial density of fixations at a location *x*. Point patterns that are generated by a homogeneous point process are uniformly distributed, and the underlying intensity *λ*(*x*) = *λ* is constant for all *x*. For inhomogeneous point processes, where the two-dimensional (2-D) density of fixation locations is non-uniformly distributed, the intensity *λ*(*x*) is estimated for each location *x* separately, i.e., as the expected number of points *E*[*N*] falling into a disc of infinitesimal size |*dx*| around *x*, divided by |*dx*|.
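The intensity estimate for the homogeneous case can be illustrated with a short simulation. The paper's analyses use *R* and *spatstat*; the sketch below uses Python instead, and the window size and intensity are made-up values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a homogeneous Poisson process on a 10 x 10 window:
# the number of points is Poisson(lambda * |W|), locations are uniform.
lam, width, height = 2.0, 10.0, 10.0
n_points = rng.poisson(lam * width * height)
points = rng.uniform((0.0, 0.0), (width, height), size=(n_points, 2))

# For a homogeneous process the intensity is constant, so the natural
# estimate is the observed number of points per unit area.
lam_hat = len(points) / (width * height)
```

With enough points, `lam_hat` approaches the generating intensity; for an inhomogeneous process the same idea is applied locally with a kernel-density estimate.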

A second-order statistic of a point process is the pair density *ρ*(*r*). The pair density (or second-order intensity function) describes the probability of simultaneously observing points generated by a point process in two discs with centers *x* and *y* of infinitesimal sizes *dx* and *dy*. For a stationary and isotropic point process, the pair density *ρ*(*x*, *y*) = *ρ*(*r*) depends on the distance between pairs of points only, where *r* corresponds to the distance ||*x* − *y*|| (Diggle, 2013). Mathematically, we estimate the pair density at distance *r* by a kernel estimate, where *k* represents an Epanechnikov kernel^{1} (Baddeley, Rubak, & Turner, 2015) and each pair of points (*x*_{i}, *x*_{j}) is weighted with the reciprocal of the fraction of the window area in which the first point *x*_{i} could be placed so that both points *x*_{i} and *x*_{j} would be observable (Baddeley et al., 2015).

Normalizing the pair density by the intensity *λ*(*x*) of an inhomogeneous point process at each fixation location yields the pair correlation function (PCF), i.e., the ratio of the pair density to the product of intensities, *g*(*x*, *y*) = *ρ*(*x*, *y*)/[*λ*(*x*)*λ*(*y*)]. The PCF is non-negative, *g*(*r*) ≥ 0, at all distances *r*. Values of the PCF close to one, *g*(*r*) ≈ 1, indicate that pairs of points at distance *r* are independent; points at distance *r* occur solely due to the underlying intensity *λ*(*x*). For larger values, i.e., *g*(*r*) > 1, points are more abundant at distance *r* than expected from the intensity *λ*(*x*). Thus, pairs of points at distance *r* interact: Observing a point *x* increases the probability of observing a point *y* at distance *r*, so that the probability of observing point *y* is higher than predicted by the local intensity *λ*(*y*). Conversely, smaller values, *g*(*r*) < 1, reveal that points are less abundant than the spatial average at distance *r*; observing a point reduces the probability of observing a second point at distance *r*.
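The logic of the PCF can be made concrete with a small simulation. The Python sketch below estimates *g*(*r*) for a homogeneous pattern using an Epanechnikov kernel; unlike the estimator used in the paper, it omits the edge correction, so values are slightly biased downward near the window boundary. All numerical values are illustrative.

```python
import numpy as np

def epanechnikov(x, w):
    # Quadratic kernel truncated to [-w, w]; integrates to one.
    return np.where(np.abs(x) <= w, 0.75 / w * (1.0 - (x / w) ** 2), 0.0)

def pcf_homogeneous(points, r_grid, bandwidth, area):
    # g(r) for a homogeneous pattern without edge correction: sum kernel
    # weights over ordered pairs of inter-point distances and normalize
    # by the expected pair density 2*pi*r * lambda^2 * |W|.
    n = len(points)
    lam = n / area
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    d = d[~np.eye(n, dtype=bool)]          # drop self-pairs
    return np.array([epanechnikov(r - d, bandwidth).sum()
                     / (2.0 * np.pi * r * lam ** 2 * area) for r in r_grid])

rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 20.0, size=(800, 2))   # CSR pattern on a 20 x 20 window
g = pcf_homogeneous(pts, np.array([1.0, 2.0, 3.0]), bandwidth=0.5, area=400.0)
# Under complete spatial randomness, g(r) should be close to one.
```

For a clustered pattern the same estimator yields *g*(*r*) > 1 at short distances, and for a regular pattern *g*(*r*) < 1.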

To illustrate the behavior of the PCF, we discuss three simulated examples of homogeneous point processes with constant intensity *λ*(*x*) = *λ* (cf. Equation 4), since deviations from uniformity are easier to interpret visually. The same interpretation, however, can be applied to inhomogeneous PCFs. The first example shows a regular point pattern (left column). Visual inspection of the points indicates a grid-like arrangement; the distance between neighboring points is relatively constant. The resulting PCF (bottom row) summarizes this behavior. At short distances, *r* < 4, the PCF reveals a strong inhibitory effect, *g*(*r*) ≈ 0: The existence of a point impedes the occurrence of other points within this radius. At medium distances, 4 < *r* < 6, the PCF reveals aggregation of points, *g*(*r*) > 1: Observing a point boosts the occurrence of points at this distance; hence the grid-like appearance. At larger distances, *r* > 6, the PCF lends support to the hypothesis of independence of points, since *g*(*r*) ≈ 1. We observe no long-range interaction of pairs of points, and the distribution of distances can be explained by the density distribution, *λ*(*x*).

The second example shows a completely random point pattern (middle panels). As expected, the PCF indicates independence at all distances, *g*(*r*) ≈ 1. Note that the apparent aggregation at short distances *r* is an artifact generated by the estimation process. Finally, the third example illustrates an aggregated point process (right panels). The PCF at short distances, *r* < 2, reveals aggregation, *g*(*r*) > 1, while the PCF at longer distances reveals independence, *g*(*r*) ≈ 1. Thus, observing a point increases the likelihood of observing other points in close proximity, while the occurrence of distant points can be explained by the uniform distribution.

All computations were carried out in *R* using the *spatstat* (Baddeley & Turner, 2005; Baddeley et al., 2015) and *ggplot2* (Wickham, 2009) packages. R code for our analysis can be found here: http://www.rpubs.com/Hans/PCF.

As control conditions, we simulated a homogeneous and an inhomogeneous point process, both of which generate points that are independent at all distances *r*. Any observed correlations in their PCFs would therefore be spurious and depend on the data structure (e.g., length of fixation sequences) or a wrong parameterization of the method. Hence, both control processes ensure that correlations in the PCF arise from the empirical data and not from the method itself. In addition, the inhomogeneous point process is used in the second step to estimate an optimal bandwidth for the intensity estimation of the PCF.

For the intensity estimation, we used Scott's rule of thumb (*R* function: bw.scott) from the *spatstat* package (Baddeley & Turner, 2005) and estimated an optimal bandwidth for each image. The estimated intensity then entered the computation of the inhomogeneous PCF as a function of distance *r*. We computed the deviation from complete spatial randomness (Equation 6) for PCFs computed with different bandwidths. We varied bandwidths from 0° to 10° in steps of 0.1° and computed the deviation from complete spatial randomness for each scanpath. The average deviation at each bandwidth is plotted in Figure 4; lines represent individual images. For all images, the deviation increases for small and for large bandwidths, with an optimal bandwidth between 1.5° and 5°. The bandwidth yielding the smallest deviation was chosen for the intensity estimation of the PCF.
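The bandwidth criterion can be expressed compactly. The snippet below defines a deviation score in the spirit of Equation 6 (the summed absolute deviation of the PCF from one over a distance grid; the exact weighting used in the paper may differ) and selects the bandwidth that minimizes it. The PCF curves in the toy example are made-up stand-ins.

```python
import numpy as np

def deviation_from_csr(g, r_grid):
    # Summed absolute deviation of the PCF from one over the distance grid
    # (in the spirit of Equation 6; grid spacing as integration weight).
    dr = np.gradient(r_grid)
    return float(np.sum(np.abs(g - 1.0) * dr))

# Toy example: among hypothetical PCF estimates computed with different
# bandwidths, pick the bandwidth whose PCF is flattest (closest to one).
r = np.linspace(0.1, 6.5, 65)
pcfs = {0.5: 1.0 + 0.4 * np.exp(-r),
        1.0: 1.0 + 0.1 * np.exp(-r),
        2.0: 1.0 + 0.3 * np.exp(-r)}
best_bw = min(pcfs, key=lambda bw: deviation_from_csr(pcfs[bw], r))
```

Applied to the simulated inhomogeneous control process, this procedure returns the bandwidth at which the estimated PCF shows the least spurious structure.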

The resulting PCFs revealed aggregation of fixations at distances *r* < 4° in all conditions. For a more detailed discussion see the Results section.

Thirty-five participants took part in the experiment (mean age *M* = 24.0 years). Participants received study credits or 8€ for participation and were recruited at the University of Potsdam and from a local school (32 students from the University of Potsdam, three pupils from a local high school). All participants had normal or corrected-to-normal vision as assessed by the Freiburg Vision Test (Bach, 1996).^{2} The study conformed to the national ethics guidelines. We obtained written informed consent from all participants.

Saccades were detected with a velocity-based algorithm whenever eye velocity exceeded a threshold of several *SD*s for at least 6 ms with a minimum amplitude of 0.5°. Eye traces between successive saccades were tagged as fixations. Fixation positions were computed by averaging the mean eye position of both eyes. Since trials started with a fixation check, first fixations on images were removed from the data set (*N* = 2,100). In addition, fixations containing a blink or with a blink during an adjacent saccade were excluded from subsequent analyses (*N* = 2,214). Overall, 55,526 fixations remained for further analyses.

In the SceneWalk model (Engbert et al., 2015), two maps of activation drive saccade target selection, an attention map and a fixation map, each computed on a grid of size *k* × *l* = 128 × 128 (Stensola et al., 2012). Activations in the attention map *a*_{ij} at coordinates (*i*, *j*) evolve over time. To approximate potential saccade targets, we compute the empirical density of fixation locations on a given image. The empirical density contains all effects generated by bottom-up and top-down processing that contribute to the inhomogeneous distribution of fixation locations. In our model simulations, the empirical density feeds into the attention map. Extraction from the empirical density is highest at fixation and decreases with increasing eccentricity; this corresponds to the attentional window of our model. Mathematically, the empirical density of fixation locations is weighted by a Gaussian envelope of size *σ*_{1}. The position of the attentional window changes after each saccade and remains constant otherwise. In addition, leakage leads to a temporal decay of activations. The updating rule of the attention map *a*_{ij} combines a proportional decay with rate *ρ* and a normalized point-wise product of a 2-D Gaussian *A*_{ij}(*t*), centered upon fixation at time *t*, with the distribution of fixation locations *ϕ*_{ij}:

$$\frac{d a_{ij}}{dt} = \frac{A_{ij}(t)\,\phi_{ij}}{\sum_{kl} A_{kl}(t)\,\phi_{kl}} - \rho\, a_{ij}(t)$$

The temporal evolution of the fixation map *f*_{ij} is very similar to the dynamics of the attention map *a*_{ij}: Activation accumulates centered at fixation, combined with a proportional temporal decay across the map. The updating rule for the fixation map *f*_{ij} combines a normalized 2-D Gaussian *F*_{ij}(*t*) with standard deviation *σ*_{0}, centered at fixation at time *t*, with a decay rate *ω* of the fixation map:

$$\frac{d f_{ij}}{dt} = \frac{F_{ij}(t)}{\sum_{kl} F_{kl}(t)} - \omega\, f_{ij}(t)$$

The fixation map tracks fixated areas and is motivated by inhibition of return (Klein & MacInnes, 1999), which has been suggested as a mechanism to drive exploration in scenes (Itti & Koch, 2001). Although the role of inhibition of return has been questioned (Smith & Henderson, 2009), model simulations support inhibitory tagging as an important mechanism during scene perception (Rothkegel et al., 2016; cf. Bays & Husain, 2012).

The potential for saccade target selection *u*_{ij} is given by the difference of the normalized attention and fixation maps,

$$u_{ij}(t) = \frac{a_{ij}(t)^{\lambda}}{\sum_{kl} a_{kl}(t)^{\lambda}} - \frac{f_{ij}(t)^{\gamma}}{\sum_{kl} f_{kl}(t)^{\gamma}}\,,$$

where *λ* and *γ* are free parameters. Engbert et al. (2015) fixed these parameters to *λ* = 1 to reproduce the densities of gaze positions and *γ* = 0.3 to reproduce spatial correlations between fixation locations. We kept these values in our simulations. The probability of a location (*i*, *j*) being chosen as the next saccade target can be extracted from the potential, where *S* contains all positions on the grid with *u*_{ij}(*t*) ≤ 0 and a free parameter *η* adds noise to the selection process so that every position has at least a minimal probability of being chosen as the next saccade target. Target selection in the SceneWalk model occurs at the end of a fixation, and the eyes move instantaneously. The intervals between successive saccades were drawn from a Gamma distribution with a shape parameter of 9 and a scale parameter of ∼0.031, which corresponds to a mean fixation duration of 275 ms.
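The dynamics above can be sketched in a few dozen lines. The following Python toy implementation is a simplified reading of the update rules: the grid size, time step, Gaussian widths, decay rates, and noise handling are illustrative choices rather than the fitted parameters, while *λ* = 1, *γ* = 0.3, and the Gamma-distributed fixation durations are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
K = 32                                   # coarse grid (the paper uses 128 x 128)
phi = rng.random((K, K))
phi /= phi.sum()                         # stand-in for the empirical fixation density

def gauss2d(center, sigma):
    # 2-D Gaussian centered on the current fixation position.
    ii, jj = np.mgrid[0:K, 0:K]
    return np.exp(-((ii - center[0]) ** 2 + (jj - center[1]) ** 2) / (2 * sigma ** 2))

a = np.full((K, K), 1.0 / K ** 2)        # attention map
f = np.full((K, K), 1.0 / K ** 2)        # fixation (inhibition) map
rho, omega, dt = 0.1, 0.1, 1.0           # illustrative decay rates and time step
lam, gam, eta = 1.0, 0.3, 1e-4           # lambda, gamma as in Engbert et al. (2015)
fix = (K // 2, K // 2)

for _ in range(10):
    A = gauss2d(fix, sigma=8.0)          # attentional window (sigma_1)
    F = gauss2d(fix, sigma=3.0)          # inhibitory tagging (sigma_0)
    drive = A * phi
    a += dt * (drive / drive.sum() - rho * a)      # attention-map update
    f += dt * (F / F.sum() - omega * f)            # fixation-map update
    u = a ** lam / (a ** lam).sum() - f ** gam / (f ** gam).sum()
    p = np.clip(u, 0.0, None) + eta      # exclude negative potential, add noise floor
    p /= p.sum()
    fix = divmod(rng.choice(K * K, p=p.ravel()), K)
    duration = rng.gamma(shape=9.0, scale=0.031)   # fixation duration, mean ~275 ms
```

Iterating this loop produces a scanpath whose fixation density follows the input map while inhibitory tagging pushes the simulated gaze toward unexplored regions.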

For the simulation of the second inspection of images, we re-estimated the attentional span *σ*_{1} and the inhibition span *σ*_{0}. We used a genetic algorithm approach (Mitchell, 1998) to estimate model parameters. Parameter estimation was based on the first five images of each image type; the remaining 10 images in each image set were used for model evaluation. Limiting the analysis to the predicted images did not alter the effects. We used first-order statistics (the 2-D density of fixation locations) and the distribution of saccade lengths as an objective function to evaluate parameters. A list of estimated parameter values and standard errors can be found in Table 1.

Statistical analyses were based on linear mixed-effects models (LMEs) computed with the *lme4* package (Bates, Mächler, Bolker, & Walker, 2015) in *R* (R Core Team, 2018). We log-transformed both dependent variables, since they deviated considerably from normal distributions. For the statistical model of the empirical data, we used the maximal possible random-effects structure (Barr, Levy, Scheepers, & Tily, 2013) and ensured that none of the models was degenerate (Bates, Kliegl, Vasishth, & Baayen, 2015). For our results, we interpret all |*t*| > 2 as significant fixed effects (Baayen, Davidson, & Bates, 2008).

We estimated PCFs of individual scanpaths as a function of distance *r*. The estimated PCFs deviated from complete spatial randomness, i.e., *g*(*r*) > 1, at short distances *r* < 4°. More importantly, the second presentation of an image (dashed lines) led to increased PCFs for both natural scenes (top row) and texture images (bottom row). Statistically, we evaluated spatial correlations by computing the deviation from complete spatial randomness of each PCF, i.e., the summed deviation of the PCF from one for distances 0.1 ≤ *r* ≤ 6.5 (Figure 10; cf. Equation 6). An LME revealed a significant deviation from complete spatial randomness (intercept) and an effect of presentation (Table 2). All other fixed effects were nonsignificant. Thus, deviations from complete spatial randomness were present in all conditions of our experiment, with larger deviations during the second inspection of an image irrespective of image type.

Simulations of the SceneWalk model reproduced the aggregation of fixation locations at short distances, *r* < 6°; the effect extended to larger distances in our model simulations than in the experimental data. We analyzed deviations from complete spatial randomness with another LME for the SceneWalk model (Table 2; cf. Figure 10). PCFs deviated from complete spatial randomness (intercept), and the effect was larger for the second inspection. No other fixed effect was significant for the deviation score of the SceneWalk model.

The control model, in contrast, underestimated correlations at short distances *r* < 3° and generated stronger correlations at large distances *r* > 4°. Thus, a model that is solely based on the generation of realistic saccade amplitudes and fixation densities generates a qualitatively different correlation pattern. We analyzed deviations from complete spatial randomness with another LME for the control model (Table 2; cf. Figure 10). PCFs deviated from complete spatial randomness (intercept). The effect was larger for the second inspection and on texture images. We observed no interaction of presentation and image type in the deviation score.

Our analyses demonstrate an aggregation of fixation locations within a scanpath at short distances *r* < 4°; that is, it is more likely to observe fixation locations in the proximity of another fixation location than expected from the overall distribution of fixation locations. This effect cannot be explained by the tendency of participants to generate short saccade amplitudes alone, as simulations of a control model that samples fixation locations from the joint probability of the density of saccade amplitudes and the density of fixation locations led to qualitatively different PCFs. The control model underestimated aggregation at short distances and overestimated aggregation at large distances. In addition, the PCF responded sensitively to a memory manipulation in our experiment and revealed stronger aggregation of points during the second inspection than during the first inspection of an image. Simulations of the SceneWalk model (Engbert et al., 2015) demonstrated that a reduced attentional span could lead to reduced saccade amplitudes and explain the observed results of the PCF during the second inspection.

The PCF quantifies correlations between pairs of fixation locations at distance *r*. It can be applied to eye movement data (i.e., fixation locations) in three steps. In a first step, inhomogeneous and homogeneous point processes need to be simulated to evaluate the PCF estimation; both point processes generate fixation locations that are independent at all distances *r*. Hence, PCFs of both processes are expected to show no spatial correlations at any distance *r*. In a second step, an optimal bandwidth needs to be chosen for the estimation of the PCF. As a criterion, we suggest using the bandwidth for which the PCF of the simulated inhomogeneous point process deviates least from complete spatial randomness, i.e., shows no spatial correlations. In a last step, the computed bandwidth is used to compute PCFs for each individual trial.
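The first step, simulating an inhomogeneous control process, can be implemented by rejection thinning (Lewis & Shedler): simulate at a constant maximum intensity and keep each candidate point with probability proportional to the local intensity. The intensity surface below is a made-up Gaussian blob standing in for an empirical fixation density.

```python
import numpy as np

rng = np.random.default_rng(3)

def lam(xy):
    # Hypothetical smooth intensity surface (a Gaussian blob); in practice
    # this would be a kernel-density estimate of the fixation locations.
    return 50.0 * np.exp(-((xy[:, 0] - 5.0) ** 2 + (xy[:, 1] - 5.0) ** 2) / 8.0)

# Thinning: simulate a homogeneous process at the maximum intensity, then
# keep each candidate with probability lambda(x) / lambda_max. The result
# is an inhomogeneous Poisson process whose points are independent at all
# distances r, as required for the control condition.
lam_max, width = 50.0, 10.0
n_cand = rng.poisson(lam_max * width * width)
cand = rng.uniform(0.0, width, size=(n_cand, 2))
keep = rng.random(n_cand) < lam(cand) / lam_max
points = cand[keep]
```

Setting `lam` to a constant reduces the same code to the homogeneous control process of the first step.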

This procedure yields PCF estimates as a function of distance *r*. PCFs revealed spatial correlations of fixation locations in all conditions of our experimental data: Fixations were more abundant at distances *r* < 4° than we would expect from the inhomogeneity of fixation locations alone, while beyond 4° fixation locations were independent of each other. Thus, observing a fixation increased the probability of observing more fixations within 4° than expected from the overall inhomogeneity; beyond 4°, fixations were as likely as predicted by the local intensity of fixation locations. As expected, neither the inhomogeneous nor the homogeneous point process revealed strong spatial correlations, since fixation locations were independent of each other for these point processes. Therefore, the aggregation observed in our PCFs is generated by the empirical data and is not a result of the method itself.

The aggregation cannot be attributed to short saccades alone, i.e., to successive fixations within *r* < 2°. There must be additional mechanisms that lead participants to fixate the same locations within a scanpath, such as direct regressions (cf. facilitation of return; Smith & Henderson, 2009) and reinspections later during a trial.

*Behavior Research Methods*, 47(4), 1377–1392, https://doi.org/10.3758/s13428-014-0550-3.
*Journal of Memory and Language*, 59(4), 390–412, https://doi.org/10.1016/j.jml.2007.12.005.
*Optometry & Vision Science*, 73(1), 49–53.
*Spatial point patterns: Methodology and applications with R*. Boca Raton, FL: CRC Press. https://doi.org/10.18637/jss.v075.b02
*Journal of Statistical Software*, 12(i06), 1–42.
*Behavioral & Brain Sciences*, 20(4), 723–767.
*Journal of Memory and Language*, 68(3), 255–278, https://doi.org/10.1016/j.jml.2012.11.001.
*Journal of Vision*, 13(12):1, 1–34, https://doi.org/10.1167/13.12.1. [PubMed] [Article]
*Journal of Statistical Software*, 67(1), 1–48, https://doi.org/10.18637/jss.v067.i01.
*Journal of Vision*, 12(8):8, 1–8, https://doi.org/10.1167/12.8.8. [PubMed] [Article]
*Proceedings of the National Academy of Sciences*, 105(38), 14325–14329.
*Spatial Vision*, 10, 443–446.
*Journal of Vision*, 9(3):5, 1–24, https://doi.org/10.1167/9.3.5. [PubMed] [Article]
*MIT saliency benchmark*. Retrieved from http://saliency.mit.edu/
*Journal of Vision*, 9(3):6, 1–15, https://doi.org/10.1167/9.3.6. [PubMed] [Article]
*Advances in Neural Information Processing Systems*, 20, 241–248.
*Journal of Vision*, 17(11):12, 1–19, https://doi.org/10.1167/17.11.12. [PubMed] [Article]
*Vision Research*, 102, 41–51, https://doi.org/10.1016/j.visres.2014.06.016.
*Behavior Research Methods, Instruments & Computers*, 34, 613–617.
*Statistical analysis of spatial and spatio-temporal point patterns*. Boca Raton, FL: CRC Press.
*Journal of Vision*, 8(14):18, 1–26, https://doi.org/10.1167/8.14.18. [PubMed] [Article]
*Vision Research*, 43, 1035–1045.
*Proceedings of the National Academy of Sciences, USA*, 103, 7192–7197.
*Journal of Vision*, 15(1):14, 1–17, https://doi.org/10.1167/15.1.14. [PubMed] [Article]
*Vision Research*, 50(8), 779–795.
*Vision Research*, 48(17), 1777–1790.
*Trends in Cognitive Sciences*, 9(4), 188–194.
*Journal of Experimental Psychology: Human Perception & Performance*, 28(1), 113–136.
*Psychonomic Bulletin & Review*, 8(4), 761–768.
*Vision Research*, 45(14), 1901–1908.
*Nature Reviews Neuroscience*, 2(3), 194–203.
*IEEE 12th International Conference on Computer Vision*, 2009 (pp. 2106–2113). Kyoto, Japan: IEEE. https://doi.org/10.1109/ICCV.2009.5459462
*PLoS One*, 6(7), e21719.
*Journal of Vision*, 11(13):26, 1–29, https://doi.org/10.1167/11.13.26. [PubMed] [Article]
*Journal of Vision*, 9(5):7, 1–15, https://doi.org/10.1167/9.5.7. [PubMed] [Article]
*Perception*, 36 (ECVP Abstract Supplement).
*Psychological Science*, 10(4), 346–352.
*Human Neurobiology*, 4, 219–222.
*Psychological Science*, 21(11), 1551–1556.
*CoRR*, abs/1610.01563. Retrieved from http://arxiv.org/abs/1610.01563
*Vision Research*, 41(25–26), 3559–3565.
*Journal of Ecology*, 97(4), 616–628.
*Vision Research*, 116, 152–164.
*Perception*, 26(8), 1059–1072.
*Nature*, 412(6845), 401.
*Vision Research*, 41(25–26), 3597–3611.
*An introduction to genetic algorithms*. Cambridge, MA: MIT Press.
*Journal of Vision*, 10(8):20, 1–19, https://doi.org/10.1167/10.8.20. [PubMed] [Article]
*Behavior Research Methods*, 38(2), 251–261, https://doi.org/10.3758/BF03192777.
*Vision Research*, 42(1), 107–123.
*Spatial Vision*, 10, 437–442.
*Vision Research*, 41(25–26), 3587–3596.
*R: A language and environment for statistical computing*. Vienna, Austria: R Foundation for Statistical Computing.
*Network: Computation in Neural Systems*, 10, 341–350.
*Vision Research*, 129, 33–49, https://doi.org/10.1016/j.visres.2016.09.012.
*Journal of Vision*, 17(13):3, 1–18, https://doi.org/10.1167/17.13.3. [PubMed] [Article]
*Journal of Vision*, 19(3):1, 1–23, https://doi.org/10.1167/19.3.1. [PubMed] [Article]
*Psychological Review*, 124(4), 505–524.
*Visual Cognition*, 17(6–7), 1083–1108.
*Psychonomic Science*, 19(2), 73–74.
*Nature*, 492(7427), 72–78.
*Journal of Vision*, 7(14):4, 1–17, https://doi.org/10.1167/7.14.4. [PubMed] [Article]
*Vision Research*, 45(5), 643–659.
*Journal of Eye Movement Research*, 2(2), 1–18.
*Psychological Review*, 113(4), 766–786.
*2014 IEEE Conference on Computer Vision and Pattern Recognition* (pp. 2798–2805). Columbus, OH: IEEE. https://doi.org/10.1109/CVPR.2014.358
*ggplot2: Elegant graphics for data analysis*. New York, NY: Springer. Retrieved from http://had.co.nz/ggplot2/book
*PLoS Computational Biology*, 9(1), e1002871.
*Eye movements and vision*. New York, NY: Plenum Press.
*Journal of Vision*, 8(7):32, 1–20, https://doi.org/10.1167/8.7.32. [PubMed] [Article]

^{1}The Epanechnikov kernel, *k*(*x*) = 3/(4*w*) (1 − *x*²/*w*²)_{+} with (*x*)_{+} = *max*(0, *x*), is a quadratic function that is truncated to the interval [−*w*, *w*].