Open Access
Article  |   January 2019
Recurrence quantification analysis of eye movements during mental imagery
Author Affiliations
Journal of Vision January 2019, Vol.19, 17. doi:10.1167/19.1.17
      Lilla M. Gurtner, Walter F. Bischof, Fred W. Mast; Recurrence quantification analysis of eye movements during mental imagery. Journal of Vision 2019;19(1):17. doi: 10.1167/19.1.17.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Several studies demonstrated similarities of eye fixations during mental imagery and visual perception but—to our knowledge—the temporal characteristics of eye movements during imagery have not yet been considered in detail. To fill this gap, the same data is analyzed with conventional spatial techniques such as analysis of areas of interest (AOI), ScanMatch, and MultiMatch and with recurrence quantification analysis (RQA), a new way of analyzing gaze data by tracking re-fixations and their temporal dynamics. Participants viewed and afterwards imagined three different kinds of pictures (art, faces, and landscapes) while their eye movements were recorded. While fixation locations during imagery were related to those during perception, participants returned more often to areas they had previously looked at during imagery and their scan paths were more clustered and more repetitive when compared to visual perception. Furthermore, refixations of the same area occurred sooner after initial fixation during mental imagery. The results highlight not only content-driven spatial similarities between imagery and perception but also shed light on the processes of mental imagery maintenance and interindividual differences in these processes.

Introduction
There is more to eye movements than simply looking at things. As early as 1935, Totten showed that we move our eyes when we think about images even though we do not have anything to look at. Computer-based eye tracking has confirmed this observation and revealed that, even in the absence of visual input, people direct their gaze towards positions in imaginary scenes where objects would be present (Brandt & Stark, 1997; Laeng & Teodorescu, 2002; Martarelli & Mast, 2013; Spivey & Geng, 2001). This similarity might not be purely epiphenomenal, as there is evidence that eye movements towards areas visited during encoding are associated with successful retrieval in adults (Johansson & Johansson, 2014; Olsen, Chiew, Buchsbaum, & Ryan, 2014) and children (Martarelli & Mast, 2011). Beyond gaze patterns, neural activation patterns during visual mental imagery also resemble those observed during visual perception (Bone et al., 2018). The cognitive dimension of eye movements toward absent objects is further underlined by the findings that the order of fixations during retrieval influences the accuracy of retrieval (Bochynska & Laeng, 2015) and that the mere planning of eye movements can interfere with performance in visuospatial working memory tasks (Postle, Idzikowski, Della Sala, Logie, & Baddeley, 2006).
In the past, studies of similarity between eye movements during mental imagery and perception have considered mainly spatial aspects. For example, Spivey and Geng (2001) and Johansson, Holsanova, and Holmqvist (2006) examined saccade directions to see if they corresponded to directions expressed in voice recordings. Other authors compared the proportion of fixations spent in predefined areas of interest (AOI) during perception and mental imagery (Laeng & Teodorescu, 2002; Martarelli & Mast, 2013; Richardson & Spivey, 2000). Brandt and Stark (1997) split their screen into a finer grained grid than, for example, Laeng and Teodorescu (2002) and compared fixation sequences in these grids. All these methods analyze the spatial dimension of fixations or saccades, but do not describe eye movements during mental imagery in terms of their temporal regularities and dependencies. Methods addressing the temporal patterns of eye movements, however, can reveal the way mental images evolve over time, how they are generated and maintained in the first place (Kosslyn, 1994). For example, mental images could be constructed serially, where one piece at a time is generated from memory. Such a serial construction would be reflected in a gaze that remains at the same location for several consecutive fixations before moving on to the next area to be constructed. Refixations would thus happen soon or immediately after the initial fixation of an area. Alternatively, mental images could be constructed in a more holistic fashion: Several parts of the mental image could be quickly generated from memory and, once in place, they would only need to be maintained. Such maintenance would be reflected in alternating eye movements between different areas. Thus, in a holistic construction process, we expect more time to pass between refixations of the same area than in a serial construction, where refixations happen without intermediate visits to other areas. 
The serial and the holistic process can nevertheless lead to the same overall distribution of fixations over the screen: In the serial process, the areas are visited one after the other while in the holistic process the gaze alternates between these areas. Therefore, spatial information does not discriminate between the two processes and they can only be distinguished by considering the temporal patterns of eye movements. It is known that mental images fade over time (De Beni, Pazzaglia, & Gardini, 2007; Farah, 1989; Kosslyn, 1994), and therefore it is more plausible that eye fixations jump back and forth between parts of the image to counteract the fading of the mental image. Although the serial and holistic construction are not the only conceivable ways a mental image could be generated and maintained, they illustrate that the analysis of temporal characteristics can add to the understanding of the mental imagery process. 
The temporal information is, to a certain extent, preserved in methods analyzing fixation sequences, but there are also tools to directly analyze the temporal patterns of fixations (see Anderson, Anderson, Kingstone, and Bischof, 2015, for a comprehensive review of eye movement analysis methods). Recent methods such as ScanMatch (Cristino, Mathôt, Theeuwes, & Gilchrist, 2010) and MultiMatch (Dewhurst et al., 2012) assess the spatial similarity of entire fixation sequences. ScanMatch captures the position of fixations within a grid overlaid on the screen, whereas MultiMatch compares scan path shapes more flexibly by converting them into vectors before comparing them. In contrast to these spatially oriented methods, recurrence quantification analysis (RQA; Anderson, Bischof, Laidlaw, Risko, & Kingstone, 2013) describes the proportions and temporal sequence of refixations, analyzing how often and when participants look back to areas they have already inspected. Furthermore, RQA can quantify repetition in scan path patterns. Together, ScanMatch and MultiMatch allow for sophisticated spatial comparisons of fixation sequences while RQA describes the temporal organization of eye movements. It is important to note that information about fixation sequences as used in ScanMatch or MultiMatch is not sufficient to capture the temporal characteristics of eye movements, because both approaches systematically underestimate the number of refixations. MultiMatch underestimates refixations because it simplifies scan path sequences by grouping nearby fixations and by uniting subsequent saccades in the same direction into one single large saccade. This leads to a better overall shape comparison but systematically reduces the number of refixations to the same place.
ScanMatch or Brandt and Stark's (1997) analysis suffer from the fact that nearby and thus conceptually related fixations might fall into different grid cells (Anderson et al., 2015; Dewhurst et al., 2012). Therefore, they underestimate the number of refixations to the same area of a picture. Thus, both analysis methods that preserve sequential scan path information do not capture the temporal relationship of fixations to the same places sufficiently. 
The goal of the present study is to complement the spatial analysis of eye movements during mental imagery with RQA to assess their temporal dynamics. The participants visually explored pictures and were asked immediately afterwards to imagine them as vividly and accurately as possible while looking at a dark screen. We chose pictures from three different categories (faces, landscape, and art), in which information was distributed differently. In face pictures, we expect participants to restrict their gaze mainly to the eyes, nose and mouth regions (Henderson & Hollingworth, 1998), while in landscape and art pictures, information is spread over larger areas, hence fixations are expected to be more widely distributed. This in turn creates more refixations in face pictures and fewer in landscape and art pictures. For these reasons, the three categories should evoke different eye movements during visual perception, and we expected similar patterns to appear while participants were imagining the pictures. 
Methods
Recurrence parameters
RQA, originally a tool to visually inspect periodicities in dynamic systems (Eckmann, Kamphorst, & Ruelle, 1987), was successfully applied to eye movements by Anderson et al. (2013), Farnand, Vaidyanathan, and Pelz (2016) and Vaidyanathan, Pelz, Alm, Shi, and Haake (2014). Recurrence in eye movements indicates that an area of a picture was revisited and thereby quantifies the patterns of fixations to the same places. Hence, it provides a general description of the temporal dynamics of eye movements. If the eyes repeatedly move to a particular area of a picture, recurrence is high, whereas recurrence is zero if an observer never returns to a previously fixated area. In the recurrence plot, all fixations of a trial and person are displayed on both axes. Every point in the recurrence plot marks two fixations that were close to each other (within predefined threshold distance) and thus directed at the same area of the picture (see Figure 1). Two fixations are considered recurrent if their distance is below a threshold, usually 1°–2° of visual angle corresponding to the foveal area (Anderson et al., 2013; Farnand et al., 2016; Vaidyanathan et al., 2014). Three measures further describe the temporal patterns and relationships between these refixations. 
Figure 1
 
Recurrence plots of simulated data. On both axes, the fixations of a single person and trial are displayed. Points indicate fixations that fell in the same area of the picture. For example, in panel A, the 4th fixation and the 10th were close and they are considered recurrent. Every fixation is "recurrent" with itself, resulting in a diagonal line showing self-recurrence. Panel A shows maximal determinism where two or more areas of a picture were visited in the same order as they have been visited previously. Panel B shows a case of maximal laminarity, indicating that several subsequent fixations were spent in the same area of a picture that was otherwise inspected only once. Panel C shows a pattern of points close to the line of self-recurrence caused by refixations that happen soon after the initial visit of an area, leading to a low CORM value (center of recurrence mass, large gray point). Large temporal gaps between refixations lead to a high CORM value, as illustrated in panel D.
  1. Determinism represents the proportion of points (recurrent fixations) that fall on diagonals parallel to the line of self-recurrence. If a participant moves the gaze from A to B to C and repeats exactly the same sequence, determinism increases. Panel A of Figure 1 shows simulated data of maximal determinism. For example, three different areas were examined with fixations number 4, 5, and 6 and the same areas were later revisited in the same order with fixations 10, 11, and 12 (gray rectangle in panel A of Figure 1). Thus, determinism indicates that two or more regions were refixated in the same order. High determinism means that many fixation sequences are reenactments of previous sequences. Low determinism in turn means that the order in which areas of a picture are visited is independent of previous orders.
  2. Laminarity indicates the percentage of recurrent fixations that fall on horizontal or vertical lines in the recurrence plot (panel B in Figure 1 shows simulated data where laminarity is maximal). Laminarity refers to the case in which an area is briefly fixated at one point in time and inspected in more detail with several fixations at another time. For example, Figure 1 panel B illustrates that a region was first examined with fixations 2, 3, 4, and 5 and later revisited once with fixation number 8. A region that is fixated once and is examined in detail later leads to vertical lines in the recurrence plot. In panel B in Figure 1 a region of a picture was inspected with the 10th fixation and was later reinspected with fixations number 22, 23, and 24.
  3. The center of recurrence mass (CORM) is a measure for the overall position of recurrence points in the plot. First, the center of recurrence mass is calculated (large gray points in panels C and D in Figure 1). The CORM value then indicates its distance from the diagonal of self-recurrence. This distance is small when repeated fixations generally happen soon after the first visit of an area and thus, points in the recurrence plot are close to the diagonal line of self-recurrence (see panel C in Figure 1). For example, the area visited with the 9th fixation was revisited with fixation number 14, after only five intermediate fixations. If refixations generally happen after many intermediate fixations to other places, the points in the recurrence plot will be far from the diagonal line of self-recurrence and the CORM value will be large, as illustrated by panel D in Figure 1. In this case, re-inspections of an area are separated by a relatively high temporal interval. For example, the area inspected with the 4th fixation in panel D of Figure 1 is revisited after 20 intermediate fixations.
Taken together, RQA measures how often a person returns to a previously inspected area. In addition, it can also quantify temporal scan path properties such as how repetitive fixation sequences are, whether areas are often reinspected by consecutive fixations and whether in general, refixations of an area happen soon or late. 
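The four recurrence measures described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the MATLAB functions of Anderson et al. (2013) that were used for the analysis. The function names, the minimum line length of two points, and the normalizations (percentages computed over the upper triangle of the recurrence plot, which excludes the line of self-recurrence) are assumptions based on a common formulation of RQA for fixation data.

```python
import numpy as np

def recurrence_matrix(fix_xy, radius):
    """Binary N x N matrix: 1 where two fixations lie closer than `radius`."""
    xy = np.asarray(fix_xy, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    return (dist < radius).astype(int)

def _points_on_lines(rows, min_len):
    """Count 1s that belong to runs of at least `min_len` consecutive 1s."""
    total = 0
    for row in rows:
        run = 0
        for v in list(row) + [0]:  # trailing 0 closes a run at the end
            if v:
                run += 1
            else:
                if run >= min_len:
                    total += run
                run = 0
    return total

def rqa_measures(fix_xy, radius, min_len=2):
    """Recurrence, determinism, laminarity, and CORM (all in percent)."""
    rec = recurrence_matrix(fix_xy, radius)
    n = rec.shape[0]
    upper = np.triu(rec, k=1)  # drop self-recurrence and the mirrored half
    r = int(upper.sum())       # number of recurrent fixation pairs
    if n < 2 or r == 0:
        return {"recurrence": 0.0, "determinism": 0.0,
                "laminarity": 0.0, "corm": 0.0}
    # Diagonal lines: sequences of areas revisited in the same order.
    diag_pts = _points_on_lines(
        [np.diagonal(upper, k) for k in range(1, n)], min_len)
    # Horizontal/vertical lines: one area inspected by consecutive fixations.
    horiz_pts = _points_on_lines(upper, min_len)
    vert_pts = _points_on_lines(upper.T, min_len)
    i, j = np.nonzero(upper)
    return {
        "recurrence": 100.0 * 2 * r / (n * (n - 1)),
        "determinism": 100.0 * diag_pts / r,
        "laminarity": 100.0 * (horiz_pts + vert_pts) / (2 * r),
        # Mean lag (j - i) between recurrent fixations, scaled by N - 1.
        "corm": 100.0 * float((j - i).sum()) / ((n - 1) * r),
    }
```

For a hypothetical path that visits three well-separated areas A, B, C and then repeats them in the same order, every recurrent point lies on one diagonal, so this sketch yields a determinism of 100 and a laminarity of 0.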
ScanMatch and MultiMatch
ScanMatch (Cristino et al., 2010) and MultiMatch (Dewhurst et al., 2012) both assess spatial similarity between two fixation sequences. ScanMatch (Cristino et al., 2010) is a grid-based method that segments the screen into different cells. In our study, we used the default of 8 × 12 cells. Each cell is labeled by two letters. These letters code fixations (and their durations) of a trial as a sequence of cell names (Cristino et al., 2010, p. 693). The string sequences of two trials can then be compared in their similarity by assessing how many and what kind of changes are necessary to convert one sequence into the other. Put simply, the fewer such changes are necessary, the more similar two scan paths are. Thus, ScanMatch measures the absolute position of fixations on a screen and simplifies this information by binning it into cells. This simplification can be problematic when two fixations directed at the same object in a scene happen to fall into different cells. This problem is avoided by MultiMatch. 
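The grid-and-string idea can be illustrated with a deliberately simplified sketch. It is not the ScanMatch toolbox: ScanMatch aligns sequences with a substitution matrix and also encodes fixation durations, whereas the code below uses a plain edit distance on cell indices. The function names, the orientation of the grid (12 columns by 8 rows), and the normalization to a 0 to 1 score are assumptions made for illustration.

```python
def to_cells(fix_xy, screen_w=1280, screen_h=1024, n_cols=12, n_rows=8):
    """Map on-screen fixations (pixels) to grid cell indices."""
    cells = []
    for x, y in fix_xy:
        col = min(int(x / screen_w * n_cols), n_cols - 1)
        row = min(int(y / screen_h * n_rows), n_rows - 1)
        cells.append(row * n_cols + col)
    return cells

def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def grid_similarity(fix_a, fix_b, **grid):
    """1.0 for identical cell sequences, 0.0 when every element must change."""
    a, b = to_cells(fix_a, **grid), to_cells(fix_b, **grid)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

The binning problem mentioned above shows up directly in this sketch: two fixations only 3 pixels apart, at x = 105 and x = 108, fall on either side of a cell border and are treated as different locations.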
MultiMatch (Dewhurst et al., 2012) is a vector-based method to compare eye movement sequences. Scan paths are represented by vectors from one fixation to the next and, just like in ScanMatch, the scan paths are simplified (p. 1085). Close fixations are grouped together and saccades in the same direction are combined. Two simplified scan paths can then be aligned and compared. Most importantly, this comparison can be based on five different dimensions of the simplified saccade vectors and their accompanying fixations (Shape, Length, Direction, Position, and Duration). 
Taken together, both ScanMatch and MultiMatch assess the similarity between two scan paths based on the absolute position and sequence of fixations. More importantly, both approaches reduce scan path variance (ScanMatch by spatially binning fixations and MultiMatch by grouping close fixations and by combining saccades in the same direction). As a result, both approaches systematically underestimate the frequency of refixations and cannot replace the information about temporal patterns provided by RQA. 
Participants
Forty participants took part in the experiment on a voluntary basis without monetary compensation. One participant guessed that we were interested in similarities of eye movements during mental imagery and perception. Thus, the analysis includes the data of 39 participants, 22 males and 17 females. All participants had normal or corrected-to-normal vision and were told that the main interest of the study was to measure pupil dilation as a function of picture complexity and cognitive load. The study was approved by the Ethical committee of the Human Sciences Faculty of the University of Bern. 
Stimuli
We chose three different picture categories (face, art, landscape) that lead to different eye movement behavior during perception (Anderson et al., 2013; Henderson & Hollingworth, 1998). Participants were presented 20 colored pictures per picture category, resulting in 60 pictures in total. Ten male and 10 female face pictures were chosen from the data set of the European Conference on Visual Perception (2D face sets, 2008). The uniform blue background was extended to fit the dimensions of the screen. Thus, the face stood out as the only object in the picture and hence the fixations in this category are mostly restricted to eyes, nose, and mouth (Hernandez et al., 2009). The landscape pictures were retrieved from the internet with the constraint that the upper half of the picture had to be uniform sky with no clouds or other objects. Here, we expect the vast majority of fixations to fall in the bottom half of the landscape picture. The pictures in the category art were also retrieved from the internet and consisted of scenes that contained several human beings who were located in different parts of the picture. In this category, fixations are expected to be distributed over the entire screen since salient features such as faces can be found in several different places in the image. 
Measures
Eye movements were tracked with the iView RED tracking system (SensoMotoric Instruments, Teltow, Germany) with a precision of 0.5° of visual angle and a sampling rate of 50 Hz using iView X Software (SensoMotoric Instruments). The device is noninvasive; gaze is tracked via the reflections of infrared light by the eye's lens. The default settings for detection of fixations were used (minimal duration: 80 ms, maximal separation: 100 pixels). Stimuli were presented on a 1280 × 1024-pixel screen using Experiment Center Software (SensoMotoric Instruments). Participants were seated at a distance of 50 cm from the screen; thus, stimuli were presented at a visual angle of approximately 38° × 31°. 
Procedure
After providing informed consent, participants were seated in front of the screen for the calibration of the eye tracker and stimulus presentation. Following a short, written introduction presented on the screen, participants completed 60 trials. Each trial consisted of two tasks. At first, participants were asked to freely inspect a picture for 15 s (perception) followed by 15 s of imagining the picture they just saw (imagery). The participants were instructed to imagine the picture as vividly and accurately as possible while keeping their eyes open and centered on the screen. A dark screen was chosen for the imagery phase given that Pearson and Clifford (2005) found that brightly lit screens hamper the influence of mental imagery on perception. Upon completion of the experiment, we asked participants what they thought the experiment was testing. Only one participant guessed the purpose of the experiment, and those data were excluded from the analysis. At the end, all participants were debriefed and thanked for their participation. 
Design
For the analysis of fixation durations and their dispersion, we used a 2×3 within design with the factors task (perception, imagery) and picture category (landscape, face, art). To analyze the four RQA-parameters (percentage of recurrent fixations, determinism, laminarity, and CORM) again a 2×3 within design was employed with the factors task (perception, imagery) and picture category (landscape, face, art). For the analysis by means of ScanMatch and MultiMatch, we used a 2×3 within design with the factors comparison type (whether scan paths during mental imagery were compared to perception scan paths or to simulated random scan paths) and picture category (landscape, face, art). Dependent variables were the respective similarity measures. 
Data analysis
Our analysis consists of three parts. In the first part, the dispersion and duration of fixations are analyzed, followed by the results of RQA parameters. Finally, spatial similarities between fixations during visual perception and mental imagery (MultiMatch and ScanMatch) are analyzed. 
Fixations outside of the screen were deleted (1.68% of all fixations). The analysis was conducted using R (R Core Team, 2015) and the lme4 package (Bates, Mächler, Bolker, & Walker, 2015). Recurrence analyses were computed using MATLAB (2015) and the functions made available by Anderson et al. (2013). Two fixations are considered recurrent if their distance is below a predefined threshold of 2° of visual angle subtending the foveal area, as used by Anderson et al. (2013), Farnand et al. (2016), and Vaidyanathan et al. (2014). This angle corresponds to 64 pixels (see Measures section). 
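The conversion between degrees of visual angle and pixels can be sketched as follows. The physical screen size is not reported, so the sketch back-calculates a screen width from the reported 38° horizontal extent at a viewing distance of 50 cm. That width, the function names, and the simplification of measuring the angle around the screen center are all assumptions made for illustration.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of `size_cm` centered on the gaze."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def size_cm_for_angle(angle_deg, distance_cm):
    """Inverse of visual_angle_deg: physical size for a given visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

def deg_to_px(angle_deg, distance_cm, screen_px, screen_cm):
    """Pixels covered by `angle_deg` near the screen center."""
    return size_cm_for_angle(angle_deg, distance_cm) * screen_px / screen_cm

# Assumed screen width, derived from the reported 38 deg at 50 cm.
assumed_width_cm = size_cm_for_angle(38, 50)
threshold_px = deg_to_px(2, 50, 1280, assumed_width_cm)
```

Under these assumptions the 2° recurrence threshold comes out in the same range as the 64 pixels reported above; the exact value depends on the true screen dimensions.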
The first two sections of the results address the influence of task (perception, mental imagery) and picture category (landscape, face, art) on fixation properties and RQA parameters. Using the lme4 package in R (Bates et al., 2015), a linear mixed-effects regression was computed on each of the dependent variables (fixation durations, dispersion of fixations, percentage of recurrent fixations, determinism, laminarity, and CORM). In all models, we added a random intercept for stimuli and random intercepts and slopes for participants. Hence, these models control for the variance associated with participants and stimuli, and the full information of the data is included in the analysis (Baayen, Davidson, & Bates, 2008; Judd, Westfall, & Kenny, 2012). We report type III Wald F tests of the linear mixed-effects regression models, with degrees of freedom approximated by the Kenward-Roger method (Luke, 2017). For each model, we report the variance explained only by fixed factors (\(R_m^2\), marginal pseudo-R-squared) and the total variance explained by the fixed and random factors together (\(R_c^2\), conditional pseudo-R-squared), as suggested by Nakagawa and Schielzeth (2013). To estimate the impact of each factor separately, we report the differences in marginal pseudo-R-squared (\(R_{\Delta m}^2\)) between the model containing the predictor of interest and a restricted model that does not contain it. That is, \(R_{\Delta m}^2\) reports how much more variance is explained by the predictor in question. 
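The relation between the marginal and conditional pseudo-R-squared can be made concrete with a small sketch. It takes variance components that are assumed to have already been extracted from a fitted mixed model. The function and argument names are hypothetical, and the simple sum over random-effect variances corresponds to random-intercept models. Models with random slopes, as used here, need a more careful computation of the random-effect variance.

```python
def pseudo_r2(var_fixed, var_random, var_residual):
    """Return (marginal, conditional) pseudo-R-squared.

    var_fixed    -- variance of the fixed-effects predictions
    var_random   -- iterable of random-effect variance components
    var_residual -- residual variance
    """
    var_random_total = sum(var_random)
    total = var_fixed + var_random_total + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random_total) / total
    return marginal, conditional

def delta_marginal_r2(r2m_full, r2m_restricted):
    """Extra variance explained by the predictor left out of the restricted model."""
    return r2m_full - r2m_restricted
```

With made-up variance components of 2.0 (fixed), 1.0 and 1.0 (two random effects), and 4.0 (residual), the sketch returns a marginal value of 0.25 and a conditional value of 0.5.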
The third section of the results addresses the spatial similarity of eye movements during mental imagery and perception. For each picture category, we computed similarity between imagery and perception scan paths using ScanMatch and MultiMatch. Following Dewhurst et al. (2012), we computed similarity between mental imagery and simulated scan paths as a reference to meaningfully interpret the similarity between eye movements during mental imagery and perception. For the simulations, we first computed means and standard deviations of the number of fixations per participant and picture category in the perception condition. A normal distribution with these means and standard deviations was then used to generate the number of fixations for each trial of the simulation. For the spatial aspects of the simulation, we first fitted a two-dimensional normal distribution to the fixations on the screen, again separately per participant and picture category. For each trial of the simulations, fixations were sampled from this distribution. The simulated fixation data were thus based on the temporal and spatial fixation statistics per participant and picture category in the perception condition and show a bias toward the center of fixations. 
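The simulation procedure described above might look roughly like the following sketch for one participant and picture category. The function name, the rounding of the sampled fixation count, the lower bound of one fixation, and the use of the sample mean and covariance as the fitted two-dimensional normal distribution are assumptions. The text does not specify these details.

```python
import numpy as np

def simulate_scan_path(n_fix_per_trial, fix_xy, rng):
    """Draw one simulated trial from perception-phase fixation statistics.

    n_fix_per_trial -- fixation counts of the observed perception trials
    fix_xy          -- all observed perception fixations, shape (M, 2), pixels
    rng             -- numpy random Generator
    """
    counts = np.asarray(n_fix_per_trial, dtype=float)
    # Number of fixations: normal distribution with observed mean and SD.
    n = int(round(rng.normal(counts.mean(), counts.std(ddof=1))))
    n = max(n, 1)  # assumption: keep at least one fixation per trial
    # Fixation positions: bivariate normal fitted to the observed fixations.
    xy = np.asarray(fix_xy, dtype=float)
    mean = xy.mean(axis=0)
    cov = np.cov(xy, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n)

rng = np.random.default_rng(0)
observed_counts = [30, 35, 28, 33]                   # made-up example data
observed_fix = rng.uniform([0, 0], [1280, 1024], size=(126, 2))
simulated = simulate_scan_path(observed_counts, observed_fix, rng)
```

Because positions are drawn from a single normal distribution, simulated fixations cluster around the mean fixation position, which is the center bias noted above.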
ScanMatch and MultiMatch were both computed in MATLAB using the respective toolboxes provided by Cristino et al. (2010) and Dewhurst et al. (2012), using the respective default parameter values. For each of the similarity scores (one by ScanMatch and five by MultiMatch), a linear mixed-effects model was computed to estimate the effects of comparison type (whether imagery scan paths were compared to perception scan paths or to simulated scan paths) and picture category (landscape, face, art) on the respective similarity measure. We used the lme4 package in R (Bates et al., 2015) and added random intercepts for participants and trials. 
Results
Fixation parameters
We measured fixation durations and quantified the dispersion of fixations over the screen. To obtain fixation dispersions, we calculated the mean of all fixations per participant and trial and then computed the distance between each fixation and this mean. These distances of all fixations from the center of fixations were then aggregated into a median distance for each trial and participant. Finally, the resulting median distance from the mean fixation in pixels was converted into degrees of visual angle. A high value indicates that fixations are generally far apart whereas a low value indicates that fixations are close to each other. 
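The dispersion measure can be written out as a short sketch for a single trial. The function name and the optional `px_per_deg` conversion factor are hypothetical. The conversion to degrees is reduced here to a single division, which is an assumption about how the pixel distances were converted.

```python
import numpy as np

def fixation_dispersion(fix_xy, px_per_deg=None):
    """Median distance of a trial's fixations from their mean position."""
    xy = np.asarray(fix_xy, dtype=float)
    center = xy.mean(axis=0)                    # mean fixation of the trial
    dist = np.linalg.norm(xy - center, axis=1)  # distance of each fixation
    median_px = float(np.median(dist))
    # Optional conversion from pixels to degrees of visual angle.
    return median_px if px_per_deg is None else median_px / px_per_deg
```

Four fixations on the corners of a square with a side of 2 pixels, for instance, all lie at the same distance from their mean position, so the median distance equals the square root of 2.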
Fixation duration
Visual inspection of the regression residuals showed heteroscedasticity in that for longer fixations, the statistical model was less accurate. Therefore, the fixation durations were log-transformed and the linear mixed model was rerun. Medians (Mdn) and standard deviations reported here and in Figure 2 are original fixation durations. Task and picture category together explained 38.34% of variance; together with the random effects, 76.21% of variance was accounted for. The task had an influence on the fixation durations, F(1, 39.50) = 121.32, p < 0.001, \(R_{\Delta m}^2\) = 0.37, such that participants made longer fixations (Mdn = 607.69 ms, SD = 497.40) during imagery than during perception (Mdn = 309.38 ms, SD = 86.89). In addition, there was a significant effect of picture category, F(2, 45.86) = 29.50, p < 0.001, \(R_{\Delta m}^2\) = 0.01. Tukey post-hoc tests confirmed (p < 0.001) that participants made the shortest fixations when looking at or imagining art pictures (Mdn = 349.78 ms, SD = 367.99), followed by fixation durations when looking at or imagining pictures of landscapes (Mdn = 369.69 ms, SD = 450.95). The longest fixation durations were elicited by looking at or imagining pictures of faces (Mdn = 404.33 ms, SD = 418.28). The picture category interacted with task (imagery or perception), F(2, 117.34) = 11.53, p < 0.01, \(R_{\Delta m}^2\) = 0.00. The very small \(R_{\Delta m}^2\) indicates that the inclusion of the interaction term does not increase the explained variance by much. It is noteworthy that the variability between participants in the imagery condition is much higher than in the perception condition (imagery: SD = 497.40; perception: SD = 86.89). This increase in interindividual differences is also found in the parameters of the RQA in the next section. Despite this variability between participants, fixations during mental imagery are generally longer. 
Figure 2
 
Fixation durations as a function of picture category and task. Gray circles and triangles represent individual median fixation durations for all trials in a given combination of factors. Black circles and triangles represent the median of all individual data points in a condition and error bars represent SEM.
Dispersion of fixations
Task and picture category together explained 46.32% of variance of fixation dispersion; together with the random effects, 71.27% of variance was explained. There was a significant main effect of task, F(1, 51.37) = 107.29, p < 0.001, \(R_{\Delta m}^2\) = 0.21. Fixations during mental imagery were closer together (M = 3.70, SD = 2.75) while they were more widely distributed during perception (M = 6.87, SD = 3.40); see Figure 3. We also found a significant effect of picture category, F(2, 96.41) = 112.47, p < 0.001, \(R_{\Delta m}^2\) = 0.24. Tukey post-hoc comparisons showed differences between all three picture categories. In art pictures, the distances between fixations were largest (M = 6.99, SD = 3.93), followed by landscape pictures (M = 5.61, SD = 3.19), and face pictures (M = 3.26, SD = 1.87). This result is expected since the three picture categories vary in how the information is distributed over the screen. Face pictures only contained information in the center of the screen whereas the landscape pictures contained information in the lower half of the screen and art pictures had information distributed over the entire screen. 
Figure 3
 
Median distance to the center of fixations in degrees of visual angle. Gray circles and triangles represent the median of the distances for one individual aggregated over all trials in a given condition. Black circles and triangles represent means of the respective condition over all participants and error bars represent the SEM.
Finally, the interaction between task and picture category was significant, F(2, 114.01) = 86.30, p < 0.001, \(R_{\Delta m}^2\) = 0.06. Tukey post-hoc tests indicated that during perception, the picture category had a stronger influence on the distribution of fixations than during imagery (all ps < 0.001), although relatively little additional variance is explained by the interaction. Taken together, fixations during mental imagery were generally closer to each other and less dependent on the type of the imagined picture. 
In the next part, the temporal aspects of eye movements during mental imagery and perception will be compared by means of RQA. 
Recurrence quantification analysis
Recurrence
Recurrent fixations are those that are directed at the same place of a picture. A high percentage of recurrent fixations indicates that many fixations are directed at previously inspected areas. Visual inspection of the regression residuals showed heteroscedasticity, thus the recurrence values were log-transformed before the model was fit. F values reported are those for log-transformed data while descriptive statistics and Figure 4 are based on untransformed values. Task and picture category explained 38.47% of variance. Together with the random effects, 69.44% of variance was explained. 
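As an illustration of the recurrence measure, the following sketch counts pairs of fixations that fall within a fixed radius of each other and expresses them as a percentage of all possible pairs. It assumes the definition commonly given for gaze RQA (Anderson et al., 2013), REC = 100 · 2R / (N(N − 1)), where R is the number of recurrent pairs and N the number of fixations; the function name and the `radius` value are hypothetical and are not taken from this section.

```python
import math


def recurrence_percentage(fixations, radius):
    """Percentage of fixation pairs that land within `radius` of each other.

    fixations: list of (x, y) positions in temporal order.
    radius: distance threshold (same unit as the coordinates); an assumption.
    """
    n = len(fixations)
    if n < 2:
        return 0.0
    # Count recurrent pairs in the upper triangle (i < j) only, so that
    # the trivial self-recurrence of each fixation is excluded.
    recurrent_pairs = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(fixations[i], fixations[j]) <= radius
    )
    return 100.0 * 2 * recurrent_pairs / (n * (n - 1))


if __name__ == "__main__":
    # The third fixation returns to the location of the first one.
    scan_path = [(0, 0), (100, 0), (0, 0), (200, 0)]
    print(recurrence_percentage(scan_path, radius=10))
```

Because the measure depends on a distance threshold, tightly clustered scan paths produce higher values by construction, which is why the dispersion control reported below matters.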
Figure 4
 
Percentage of recurrent fixations during perception, mental imagery, and in simulated scan paths with a central bias. Gray circles and triangles represent individual median recurrence percentages for all trials in one condition. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
The effect of task was significant, F(1, 42.49) = 85.79, p < 0.001, \(R_{\Delta m}^2\) = 0.24, with a higher percentage of recurrent fixations during mental imagery (M = 29.12, SD = 22.40) than during perception (M = 10.93, SD = 8.45); see Figure 4. A significant effect of picture category was also found, F(2, 74.51) = 176.32, p < 0.001, \(R_{\Delta m}^2\) = 0.11. Tukey post-hoc tests showed that all three picture categories led to different recurrence values (all ps < 0.01): pictures of faces yielded the highest percentage of recurrent fixations (M = 27.68, SD = 20.85), followed by pictures of landscapes (M = 17.07, SD = 17.54) and finally art (M = 15.34, SD = 15.34). This difference between picture categories is expected given that fixations in face pictures were closest to each other because the information in the picture is concentrated in the middle of the screen. The interaction between the task and picture category was not significant. Thus, mental imagery generally led to more fixations to previously inspected areas and so did seeing or imagining faces. 
The closer fixations are, the more often fixations are considered recurrent (r = −0.70, t = −66.24, df = 4644, p < 0.001). For this reason, we wanted to confirm that the increase in recurrent fixations during mental imagery was not solely due to fixations being closer. We addressed this issue in two ways. First, we statistically controlled for the dispersion of fixations in all models of recurrence parameters, i.e., we computed a model that contained the dispersion of fixation coordinates as additional fixed predictor of recurrence. Although the dispersion of fixations was a significant predictor of recurrence, F(1, 3976.63) = 3754.42, p < 0.001, \(R_{\Delta m}^2\) = 0.32, the task and the picture category remained significant predictors, F(1, 50.73) = 30.52, p < 0.001 and F(2, 106.68) = 34.22, p < 0.05, respectively. Thus, mental imagery leads to more recurrent fixations even when taking into account the overall distributions of fixations. Second, we computed simulations of imagery data using the method described in the Data analysis section. But here we based the simulation on the temporal and spatial fixation statistics per participant and picture category in the imagery condition. Then we computed a recurrence analysis of the simulated imagery data. If the increase in recurrence were solely due to closer fixations during mental imagery, we would expect the recurrence of simulated imagery scan paths to be as high as the recurrence of real imagery scan paths. This is clearly not the case (see Figure 4). Thus, there are more refixations during mental imagery than would be expected solely by the closer arrangement of fixations in this task. Next, we analyze the refixation patterns using the three RQA measures determinism, laminarity, and CORM. 
Determinism
Determinism quantifies the repetitiveness of fixation sequences. Visual inspection of the regression residuals indicated heteroscedasticity, therefore the determinism values were log-transformed. The reported effects are computed on this basis while descriptive statistics and the data in Figure 5 are based on untransformed values. Task and picture category accounted for 34.49% of variance, while task, picture category, and the random effects together accounted for 50.93% of variance. Again, the effect of task was significant, F(1, 44.74) = 168.36, p < 0.001, \(R_{\Delta m}^2\) = 0.23, with higher values of determinism during mental imagery (M = 58.32, SD = 23.20) than during perception (M = 32.80, SD = 15.96); see Figure 5. The effect of picture category was small but significant, F(2, 63.45) = 122.54, p < 0.001, \(R_{\Delta m}^2\) = 0.04. A Tukey test (p < 0.001) showed that face pictures had the highest determinism values (M = 54.10, SD = 21.33) whereas landscape pictures (M = 41.37, SD = 24.49) had lower determinism values that were not significantly different from art pictures (M = 40.65, SD = 22.50). 
Figure 5
 
Determinism as a function of picture category and experimental condition (perception, mental imagery, or in simulated scan paths with a central bias). Gray circles and triangles represent individual median determinism values for all trials in a given combination of factors. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Furthermore, the interaction between picture category and task was small but significant, F(2, 119.23) = 56.37, p < 0.001, \(R_{\Delta m}^2\) = 0.02. To control for the confounding effect of closer fixations during mental imagery, we computed a model that contained the dispersion of fixations as additional fixed predictor. The dispersion of fixations was a significant predictor, F(1, 3058.21) = 431.29, p < 0.001, \(R_{\Delta m}^2\) = 0.09. However, task and picture category remained significant predictors, F(1, 56.24) = 107.62, p < 0.001 and F(2, 75.59) = 34.74, p < 0.001. 
Taken together, determinism values were higher during mental imagery than during perception indicating that fixation sequences during imagery were repeated more than fixation sequences during perception. 
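To make the notion of repeated fixation sequences concrete, the sketch below computes determinism as the percentage of recurrent points that lie on diagonal lines of the recurrence plot. It assumes the definition in Anderson et al. (2013), DET = 100 · |D_L| / R, with a minimum line length of 2; the function name, `radius`, and `min_line` are illustrative choices rather than the settings used in this study.

```python
import math


def determinism(fixations, radius, min_line=2):
    """Percentage of recurrent points that form diagonal lines.

    A diagonal line of length k means that a sequence of k fixations
    was later repeated in the same order.
    """
    n = len(fixations)
    rec = [
        [math.dist(fixations[i], fixations[j]) <= radius for j in range(n)]
        for i in range(n)
    ]
    n_recurrent = sum(rec[i][j] for i in range(n) for j in range(i + 1, n))
    if n_recurrent == 0:
        return 0.0
    points_in_lines = 0
    # Walk along each diagonal above the main diagonal (offset k = j - i).
    for k in range(1, n):
        run = 0
        for i in range(n - k):
            if rec[i][i + k]:
                run += 1
            else:
                if run >= min_line:
                    points_in_lines += run
                run = 0
        if run >= min_line:
            points_in_lines += run
    return 100.0 * points_in_lines / n_recurrent


if __name__ == "__main__":
    a, b, c = (0, 0), (100, 0), (200, 0)
    print(determinism([a, b, c, a, b, c], radius=10))  # repeated sequence
    print(determinism([a, b, a, c], radius=10))        # isolated refixation
```

In the first example every recurrent point belongs to one repeated three-fixation sequence, whereas the second contains a single refixation that is not part of any repeated sequence.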
Laminarity
Laminarity represents areas that were either fixated first in a single fixation and then reinspected over consecutive fixations at a later time or were first fixated in detail and then refixated briefly at a later time. Visual inspection of regression residuals indicated heteroscedasticity, thus the following calculations are based on log-transformed laminarity values while descriptive statistics and Figure 6 are based on untransformed values. A total of 27.75% of the variance was explained by the factors task and picture category; 49.88% was explained by the full model that also contained random effects. There was a significant effect of task, F(1, 46.29) = 103.52, p < 0.001, \(R_{\Delta m}^2\) = 0.15, with higher laminarity values during mental imagery (M = 51.62, SD = 22.07) than during perception (M = 29.69, SD = 15.62). In addition, there was a significant effect of picture category, F(2, 68.71) = 84.32, p < 0.001, \(R_{\Delta m}^2\) = −0.002. However, the negative \(R_{\Delta m}^2\) indicates that less variance is explained by the fixed factors when picture category is part of the model. The interaction between picture category and task was significant but small, F(2, 119.32) = 33.28, p < 0.001, \(R_{\Delta m}^2\) = 0.02. Tukey post-hoc tests showed that the influence of the task differed for the different picture categories (all ps < 0.001). 
Figure 6
 
Laminarity as a function of picture category and task (perception, mental imagery, or in simulated scan paths with a central bias). Gray circles and triangles represent individual median laminarity values for all trials of a participant. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
We computed a model that contained the dispersion of fixations as additional fixed predictor to control for the effect of closer fixations during mental imagery. The dispersion of fixations was a significant predictor, F(1, 3973.09) = 907.69, p < 0.001, \(R_{\Delta m}^2\) = 0.16, but the task and the picture category remained significant predictors, F(1, 59.65) = 32.88, p < 0.001 and F(2, 75.19) = 20.20, p < 0.001, respectively. 
Thus, during mental imagery participants more frequently explored an area in detail that they had looked at before (or they inspected it in detail at first and looked back afterwards), which led to higher laminarity values in this condition. 
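Laminarity can be illustrated in the same way as determinism, using horizontal and vertical rather than diagonal lines of the recurrence plot. The sketch assumes the definition in Anderson et al. (2013), LAM = 100 · (|H_L| + |V_L|) / (2R), evaluated on the upper triangle with a minimum line length of 2; the names `laminarity`, `_run_points`, `radius`, and `min_line` are hypothetical.

```python
import math


def _run_points(flags, min_line):
    """Sum the lengths of runs of True values that are at least min_line long."""
    total = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_line:
                total += run
            run = 0
    if run >= min_line:
        total += run
    return total


def laminarity(fixations, radius, min_line=2):
    """Percentage of recurrent points on horizontal or vertical lines."""
    n = len(fixations)
    rec = [
        [math.dist(fixations[i], fixations[j]) <= radius for j in range(n)]
        for i in range(n)
    ]
    n_recurrent = sum(rec[i][j] for i in range(n) for j in range(i + 1, n))
    if n_recurrent == 0:
        return 0.0
    # Horizontal lines: one earlier fixation is revisited by several
    # consecutive later fixations (row i, columns j > i).
    horizontal = sum(_run_points(rec[i][i + 1:], min_line) for i in range(n))
    # Vertical lines: several consecutive earlier fixations are revisited
    # by one later fixation (column j, rows i < j).
    vertical = sum(
        _run_points([rec[i][j] for i in range(j)], min_line) for j in range(n)
    )
    return 100.0 * (horizontal + vertical) / (2 * n_recurrent)


if __name__ == "__main__":
    a, b, c = (0, 0), (100, 0), (200, 0)
    # Location a is glanced at once, then reinspected over three fixations.
    print(laminarity([a, b, c, a, a, a], radius=10))
```

The two line orientations correspond to the two cases named in the text: a brief first look followed by detailed reinspection, and a detailed first look followed by a brief return.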
Center of recurrence mass (CORM)
CORM captures the temporal pattern of recurrent fixations, with small CORM values indicating that refixations tend to occur close in time and large CORM values indicating that refixations tend to occur widely separated in time. The task and picture categories explained 9.78% of the variance; together with the random effects, 28.17% of variance in CORM values was explained. The analysis showed a small but significant effect of task, F(1, 43.59) = 48.45, p < 0.001, \(R_{\Delta m}^2\) = 0.06. The imagery condition was associated with lower CORM values (M = 28.48, SD = 8.28) compared to the perception condition (M = 31.59, SD = 6.18). We also found a significant but small effect for the picture category, F(2, 47.66) = 34.81, p < 0.001, \(R_{\Delta m}^2\) = 0.05. A Tukey HSD test confirmed that CORM values were highest for face pictures (M = 32.18, SD = 6.41), followed by landscape pictures (M = 29.44, SD = 7.21), and lowest for art pictures (M = 28.47, SD = 8.15). In addition, a significant but very small interaction was observed, F(2, 118.07) = 8.24, p < 0.001, \(R_{\Delta m}^2\) = 0.004. 
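The following sketch shows how a CORM value can be obtained from a fixation sequence. It assumes the normalization given in Anderson et al. (2013), CORM = 100 · Σ (j − i) r_ij / ((N − 1) R), where the sum runs over fixation pairs with i < j; the function name and the `radius` value are illustrative assumptions.

```python
import math


def corm(fixations, radius):
    """Center of recurrence mass, scaled to the range 0 to 100.

    Each recurrent pair (i, j) with i < j is weighted by its temporal
    lag j - i, measured in number of fixations.
    """
    n = len(fixations)
    n_recurrent = 0
    lag_sum = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(fixations[i], fixations[j]) <= radius:
                n_recurrent += 1
                lag_sum += j - i
    if n_recurrent == 0:
        return 0.0
    return 100.0 * lag_sum / ((n - 1) * n_recurrent)


if __name__ == "__main__":
    a, b, c = (0, 0), (100, 0), (200, 0)
    print(corm([a, a, b, c], radius=10))  # refixation right away: lower value
    print(corm([a, b, a, c], radius=10))  # refixation one step later: higher value
```

A refixation that follows the original fixation immediately contributes a lag of 1, so scan paths with quick returns yield lower values than scan paths in which returns happen after many intervening fixations.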
Again, we computed a model that contained the dispersion of fixations as additional fixed predictor controlling for the effect of closer fixations during mental imagery. The dispersion of fixations was a significant predictor, F(1, 3262.66) = 578.03, p < 0.001, \(R_{\Delta m}^2\) = 0.14, but the task remained a significant predictor, F(1, 60.03) = 251.35, p < 0.001; the picture category was no longer significant, F(2, 63.35) = 0.57, p = 0.57. 
To summarize, the center of recurrence mass was closer to the line of self-recurrence during mental imagery, indicating that refixations of an area happen sooner after the initial fixation of that area (see Figure 7). Taken together, the fixation parameters as well as RQA parameters show differences in the general spatial and temporal characteristics of eye movements during mental imagery. In the last part of the result section, the spatial similarities between perception and imagery scan paths will be further addressed. 
Figure 7
 
Center of recurrence mass as a function of picture category and task (perception, mental imagery, or in simulated data with a central bias). Gray circles and triangles represent individual median CORM values for all trials of an individual participant. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Spatial methods for analyzing eye fixations
MultiMatch and ScanMatch
Both the MultiMatch (Dewhurst et al., 2012) toolbox and the ScanMatch (Cristino et al., 2010) toolbox for MATLAB provide similarity measures for the comparison of two fixation sequences. While ScanMatch reports a global similarity value, MultiMatch yields five output values to describe similarity with respect to direction, duration, length, position and shape of fixation sequences. Dewhurst et al. (2012) stress the importance of comparing the obtained values against a standard. Following the authors' suggestion, we computed ScanMatch and MultiMatch similarity between imagery and perception for each trial and participant and compared these values with the similarity of imagery and simulated scan paths with a central bias. That is, we measured whether the eye movements during imagery were more similar to those during perception than to eye movements that are randomly distributed around the center. The descriptive statistics of the five MultiMatch values and the global ScanMatch value are shown in Figure 8. The similarity scores separated for the three picture categories are shown in the supplementary material (Supplementary File S1). 
Figure 8
 
Similarities of scan paths according to the five MultiMatch measures and ScanMatch. Fixations made while imagining a scene are compared to those during perception and to simulated random scan paths from a distribution with central bias corresponding to the dispersion of real fixations during perception.
The linear mixed-effects models showed that scan paths during mental imagery are more similar to the corresponding perception scan paths than to simulated scan paths as measured by MultiMatch (see Table 1). Note, however, that the comparison of imagery scan paths and real or simulated scan paths (the factor comparison type) explained little variance. Hence, although statistically significant, the difference in similarity to real versus simulated scan paths is not a good predictor of scan path similarity. Thus, both spatial analysis methods suggest that scan paths during mental imagery resemble simulated fixations centered at the middle of the screen. This result is surprising given the well documented spatial similarities of eye movements during mental imagery and perception (Brandt & Stark, 1997; Johansson et al., 2006; Laeng & Teodorescu, 2002; Martarelli & Mast, 2013; Richardson & Spivey, 2000). 
Table 1
 
Effects of picture category and comparison type on MultiMatch and ScanMatch similarity measures. Notes: F values for the effects of picture category and comparison type on similarity measured by MultiMatch and ScanMatch.
AOI analysis
To show that our data is compatible with previously obtained results, we conducted the same AOI analysis as Laeng and Teodorescu (2002). The time spent in each AOI during perception predicted the time spent there during mental imagery in a linear mixed regression with an unstandardized slope coefficient b = 0.86, F(1, 14.89) = 66.68, p < 0.001, \(R_m^2\) = 0.25. Another linear mixed regression showed that the percentage of fixations spent in each AOI during imagery was predicted by the respective percentage of fixations spent in the AOI during perception, b = 0.67, F(1, 59.44) = 213.33, p < 0.001, \(R_m^2\) = 0.33. Thus, our results are in line with the findings of Laeng and Teodorescu (2002) showing that eye movements during mental imagery are spatially related to those made during perception. 
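The two AOI variables used in these regressions, the share of viewing time and the share of fixations per AOI, can be derived from a fixation list as sketched below. The rectangular AOI layout, the AOI names, and the function name are hypothetical; the actual AOI definition follows Laeng and Teodorescu (2002) and is not reproduced in this section.

```python
def aoi_shares(fixations, aois):
    """Percentage of fixation time and of fixation count per AOI.

    fixations: list of (x, y, duration_ms).
    aois: dict mapping an AOI name to (left, top, right, bottom) in pixels;
          this rectangular layout is an assumption made for illustration.
    Returns two dicts: time share and count share, both in percent.
    """
    time_in = {name: 0.0 for name in aois}
    count_in = {name: 0 for name in aois}
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= x < right and top <= y < bottom:
                time_in[name] += duration
                count_in[name] += 1
                break  # each fixation is assigned to at most one AOI
    total_time = sum(duration for _, _, duration in fixations)
    total_count = len(fixations)
    if total_count == 0 or total_time == 0:
        return {name: 0.0 for name in aois}, {name: 0.0 for name in aois}
    time_share = {name: 100.0 * t / total_time for name, t in time_in.items()}
    count_share = {name: 100.0 * c / total_count for name, c in count_in.items()}
    return time_share, count_share


if __name__ == "__main__":
    aois = {"left": (0, 0, 50, 100), "right": (50, 0, 100, 100)}
    trial = [(10, 10, 300), (60, 10, 100), (20, 20, 100)]
    print(aoi_shares(trial, aois))
```

Computing these shares once for the perception trial and once for the corresponding imagery trial yields the predictor and outcome pairs for a regression of the kind reported above.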
Discussion
Eye movements during mental imagery differ in their temporal organization from those made during perception. RQA measures show that participants returned more often to previously fixated areas during mental imagery. They did so sooner than during perception and their scan paths were more repetitive. Moreover, interindividual differences in eye movement behavior were more pronounced during mental imagery than during perception. During perception, scene content remains visible, but during imagery, the imagined scene must be reactivated repeatedly since mental images do not persist and rather fade over time (Bone et al., 2017; De Beni et al., 2007; Farah, 1989; Kosslyn, 1994). This reactivation process leads to more recurrent fixations. We argue that refixations of different regions are in the service of mental image maintenance. Recurrent fixations can be interpreted as traces of the sequential reactivation of parts of a mental image. 
We were able to rule out potential confounds. After statistically controlling for the influence of the dispersion of fixations, the task (perception or mental imagery) remained a significant predictor in all recurrence models. Moreover, the observed recurrence patterns during imagery are clearly higher than those of simulated random scan paths. The differences in recurrence between the three picture categories show that the distinct pattern of temporal organization is related to the process of mental imagery and is not due to the visual input during mental imagery, which remained the same in all three picture categories. Taken together, higher recurrence during imagery is task specific and not simply due to the closer clustering of fixations on uniform screens. 
Interestingly, lower CORM values show fewer fixations to other places between refixations during mental imagery compared to perception where visual information remains available in the peripheral visual field. The higher determinism values during mental imagery show that more fixation sequences are repeated, which has been shown to support visuospatial working memory (Bochynska & Laeng, 2015). As working memory capacity is limited it is unlikely that all information contained in a mental image is generated at once. Rather, we argue that parts of a scene are generated and maintained sequentially (Kosslyn & Shwartz, 1981) and that this process is accompanied by systematic eye movements between different parts. Thus, the results from this study support a rather holistic way of generating a mental image where several regions are rapidly generated and the gaze alternates systematically between them. Such an organizing function of eye movements during mental imagery was already proposed by Hebb (1968). Taken together, our results suggest that RQA is a promising tool that can shed light on the temporal organization of mental imagery. 
Yet another interesting finding was the large individual differences in RQA measures. It appears that individuals vary in their ability to maintain mental images. On the one hand, this could be related to the individual spatial-imagery abilities, since Johansson, Holsanova, and Holmqvist (2011) found that weaker spatial imagery abilities are accompanied by more widely distributed fixations during mental imagery. These distributed gaze patterns would in turn lead to lower overall recurrence values in these participants. On the other hand, the interindividual variation in image maintenance ability could be explained, at least partly, by differences in working memory capacity. For example, a low working memory capacity could lead to a faster decay of the mental image and refixations of inspected areas would have to occur sooner to refresh the imagined content, leading to lower CORM values. It is also possible that low working memory capacity originates in less organized retrieval and maintenance strategies, leading to lower determinism values. Regardless of the origin of the large interindividual differences, the analysis of the temporal characteristics of fixations during mental imagery could be especially sensitive to these differences since the absence of bottom-up constraints on eye movements allows individual differences to unfold their influence more strongly. It would be interesting to further investigate the potential relation between individual working memory capacity, spatial imagery ability and eye movements during mental imagery. In addition to exploring the temporal patterns of eye movements during mental imagery, the AOI analyses of our study confirm previous research showing that fixations during mental imagery are spatially related to those made during perception (Brandt & Stark, 1997; Laeng & Teodorescu, 2002; Martarelli & Mast, 2013; Richardson & Spivey, 2000; Spivey & Geng, 2001). 
The results also show that fixations during imagery were generally longer, a result that is in line with previous findings (Brandt & Stark, 1997; Recarte & Nunes, 2000). The MultiMatch algorithm shows that imagery scan paths are more similar to those during perception than to simulated scan paths. However, only little variance in scan path similarity is explained by the comparison type. Indeed, the ScanMatch algorithm indicates that scan paths during mental imagery are more similar to simulated fixations centered on the screen when compared to those made during perception. Even though this result is surprising it is in line with the findings from Brandt and Stark (1997) as well as with our finding showing that fixations during mental imagery are more closely clustered. It might be surprising that the crude AOI analysis appears to be more sensitive to the similarity between imagery and perception eye movements than the more elaborate scan path comparison methods. This apparent contradiction is resolved by keeping in mind that the AOI analysis and the scan path comparison algorithms measure two different things. While AOI analysis reflects the overall spatial localization of fixations, the scan path comparison algorithms analyze the sequence in which these fixations happen. This observation is consistent with Anderson et al.'s (2013) suggestion that RQA should be used to complement other methods for fixation analysis, not to replace them. Our results suggest that during mental imagery, we return to areas visited during perception but by means of different fixation sequences. Thus, the spatial relation between eye movements during mental imagery and perception seems to be more complex than a simple straightforward correspondence suggested by the AOI analyses. Future research is needed to more thoroughly investigate which spatial aspects of eye movements during perception are reenacted during mental imagery and which are not. 
This is particularly important, because these spatial aspects of eye movements during mental imagery reflect the spatial organization of the mental image itself while the temporal organization of these eye movements reflects the process by which the mental image is generated and maintained. Both aspects, the exact spatial organization and the generation/maintenance of mental images need to be linked in future research to gain a more complete understanding. 
Our study has several limitations. First, we did not measure mental imagery performance for example by means of error rates and response times. Future experiments will need to address this issue by relating recurrence parameters more directly to performance. Furthermore, we chose a rather broad and descriptive approach by employing three methods of scan path analysis that have—to our knowledge—not been used in previous mental imagery research. Because of the exploratory approach, the interpretations of the differences in temporal patterns of eye movements require further empirical testing. Nevertheless, we were able to show that the data from this study is compatible with the data of previous experiments. 
In sum, RQA can be a vital part of imagery research. The majority of research on mental imagery has focused on similarity between imagery and perception, either in terms of the representations used (Kosslyn, 1994; Mellet, Petit, Mazoyer, Denis, & Tzourio, 1998; Pearson & Kosslyn, 2015; Pylyshyn, 2002) or in activated brain regions (Fletcher et al., 1995; Ganis, Thompson, & Kosslyn, 2004; Mellet et al., 2002; Slotnick, Thompson, & Kosslyn, 2005). In addition, a classifier trained on brain patterns during perception can predict the content of mental imagery when provided with brain activation patterns recorded during mental imagery (Albers, Kok, Toni, Dijkerman, & Lange, 2013). However, comparatively little attention has so far been devoted to the characteristics that differentiate mental imagery from perception. This is surprising since similarities alone cannot sufficiently describe mental imagery. Not only do we rarely confuse images with percepts (Mast, 2005), but double dissociations have also been demonstrated in multiple clinical cases (e.g., Kosslyn, Holtzman, Farah, & Gazzaniga, 1985; Mellet et al., 1998). RQA is a useful tool to investigate the temporal dynamics of mental imagery, and this can substantially improve the understanding of the mechanisms that underlie mental imagery beyond the well documented spatial similarities with perception. 
Conclusion
Our results show that fixations during mental imagery are longer and more clustered. Eye gaze returns more often and sooner to previously inspected areas and interindividual variance in temporal dynamics is more pronounced during mental imagery. These results emphasize differences in temporal organization between mental imagery and perception that were neglected in previous research. 
Acknowledgments
We thank all participants of the study and Abimanju Subramaniam for his help during data collection. 
Commercial relationships: none. 
Corresponding author: Lilla M. Gurtner 
Address: Department of Psychology, University of Bern, Bern, Switzerland. 
References
2D face sets. (2008). Retrieved from http://pics.stir.ac.uk/2D_face_sets.htm [Data set].
Albers, A. M., Kok, P., Toni, I., Dijkerman, H. C., & Lange, F. P. de. (2013). Shared representations for working memory and mental imagery in early visual cortex. Current Biology, 23 (15), 1427–1431, https://doi.org/10.1016/j.cub.2013.05.065.
Anderson, N. C., Anderson, F., Kingstone, A., & Bischof, W. F. (2015). A comparison of scanpath comparison methods. Behavior Research Methods, 47 (4), 1377–1392, https://doi.org/10.3758/s13428-014-0550-3.
Anderson, N. C., Bischof, W. F., Laidlaw, E. W. K., Risko, F. E., & Kingstone, A. (2013). Recurrence quantification analysis of eye movements. Behavior Research Methods, 45, 842–856, https://doi.org/10.3758/s13428-012-0299-5.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59 (4), 390–412, https://doi.org/10.1016/j.jml.2007.12.005.
Bates, D. M., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67 (1), 1–48, https://doi.org/10.18637/jss.v067.i01.
Bochynska, A., & Laeng, B. (2015). Tracking down the path of memory: Eye scanpaths facilitate retrieval of visuospatial information. Cognitive Processing, 16 (1), 159–163, https://doi.org/10.1007/s10339-015-0690-0.
Bone, M. B., St-Laurent, M., Dang, C., McQuiggan, D. A., Ryan, J. D., & Buchsbaum, B. R. (2017). Eye-movement reinstatement and neural reactivation during mental imagery. BioRxiv. Retrieved from http://biorxiv.org/content/early/2017/05/23/107953.abstract.
Brandt, S., & Stark, L. (1997). Spontaneous eye movements during visual imagery reflect the content of the visual scene. Journal of Cognitive Neuroscience, 9 (1), 27–38, https://doi.org/10.1162/jocn.1997.9.1.27.
Cristino, F., Mathôt, S., Theeuwes, J., & Gilchrist, I. D. (2010). ScanMatch: A novel method for comparing fixation sequences. Behavior Research Methods, 42 (3), 692–700, https://doi.org/10.3758/BRM.42.3.692.
De Beni, R., Pazzaglia, F., & Gardini, S. (2007). The generation and maintenance of visual mental images: Evidence from image type and aging. Brain and Cognition, 63 (3), 271–278, https://doi.org/10.1016/j.bandc.2006.09.004.
Dewhurst, R., Nyström, M., Jarodzka, H., Foulsham, T., Johansson, R., & Holmqvist, K. (2012). It depends on how you look at it: Scanpath comparison in multiple dimensions with MultiMatch, a vector-based approach. Behavior Research Methods, 44 (4), 1079–1100, https://doi.org/10.3758/s13428-012-0212-2.
Eckmann, J. P., Kamphorst, S. O., & Ruelle, D. (1987). Recurrence plots of dynamical systems. EPL (Europhysics Letters), 4 (9), 973, https://doi.org/10.1209/0295-5075/4/9/004.
Farah, M. J. (1989). The neural basis of mental imagery. Trends in Neurosciences, 12 (10), 395–399, https://doi.org/10.1016/0166-2236(89)90079-9.
Farnand, S., Vaidyanathan, P., & Pelz, J. (2016). Recurrence metrics for eye movements in perceptual experiments. Journal of Eye Movement Research, 9 (4), 1–12, https://doi.org/10.16910/jemr.9.4.1.
Fletcher, P. C., Frith, C. D., Baker, S. C., Shallice, T., Frackowiak, R. S., & Dolan, R. J. (1995). The mind's eye—precuneus activation in memory-related imagery. NeuroImage, 2 (3), 195–200, https://doi.org/10.1006/nimg.1995.1025.
Ganis, G., Thompson, W. L., & Kosslyn, S. M. (2004). Brain areas underlying visual mental imagery and visual perception: An fMRI study. Cognitive Brain Research, 20 (2), 226–241, https://doi.org/10.1016/j.cogbrainres.2004.02.012.
Hebb, D. O. (1968). Concerning imagery. Psychological Review, 75 (6), 466–477, https://doi.org/10.1037/h0026771.
Henderson, J. M., & Hollingworth, A. (1998). Eye movements during scene viewing: An overview. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 269–293). Amsterdam: Elsevier Science Ltd, https://doi.org/10.1016/B978-008043361-5/50013-4.
Hernandez, N., Metzger, A., Magne, R., Bonnet-Brilhault, F., Roux, S., Barthelemy, C., & Martineau, J. (2009). Exploration of core features of a human face by healthy and autistic adults analyzed by visual scanning. Neuropsychologia, 47 (4), 1004–1012, https://doi.org/10.1016/j.neuropsychologia.2008.10.023.
Johansson, R., & Johansson, M. (2014). Look here, eye movements play a functional role in memory retrieval. Psychological Science, 25 (1), 236–242, https://doi.org/10.1177/0956797613498260.
Johansson, R., Holsanova, J., & Holmqvist, K. (2006). Pictures and spoken descriptions elicit similar eye movements during mental imagery, both in light and in complete darkness. Cognitive Science, 30 (6), 1053–1079, https://doi.org/10.1207/s15516709cog0000_86.
Johansson, R., Holsanova, J., & Holmqvist, K. (2011). The dispersion of eye movements during visual imagery is related to individual differences in spatial imagery ability. In Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 1200–1205). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2011/papers/0284/paper0284.pdf.
Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103 (1), 54–69, https://doi.org/10.1037/a0028347.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Kosslyn, S. M., & Shwartz, S. P. (1981). Empirical constraints on theories of visual mental imagery. Attention and Performance IX, 241–260.
Kosslyn, S. M., Holtzman, J. D., Farah, M. J., & Gazzaniga, M. S. (1985). A computational analysis of mental image generation. Evidence from functional dissociations in split-brain patients. Journal of Experimental Psychology: General, 114 (3), 311–341, https://doi.org/10.1037/0096-3445.114.3.311.
Laeng, B., & Teodorescu, D.-S. (2002). Eye scanpaths during visual imagery reenact those of perception of the same visual scene. Cognitive Science, 26, 207–231, https://doi.org/10.1207/s15516709cog2602_3.
Luke, S. G. (2017). Evaluating significance in linear mixed-effects models in R. Behavior Research Methods, 49 (4), 1494–1502, https://doi.org/10.3758/s13428-016-0809-y.
Martarelli, C. S., & Mast, F. W. (2011). Preschool children's eye-movements during pictorial recall. British Journal of Developmental Psychology, 29 (3), 425–436, https://doi.org/10.1348/026151010X495844.
Martarelli, C. S., & Mast, F. W. (2013). Eye movements during long-term pictorial recall. Psychological Research, 77 (3), 303–309, https://doi.org/10.1007/s00426-012-0439-7.
Mast, F. W. (2005). Mental images: Always present, never there. Behavioral and Brain Sciences, 28 (6), 769–770, https://doi.org/10.1017/S0140525X05340131.
MATLAB. (2015). version 8.5.0.197613 (R2015a). Natick, MA: The Mathworks, Inc.
Mellet, E., Bricogne, S., Crivello, F., Mazoyer, B., Denis, M., & Tzourio-Mazoyer, N. (2002). Neural basis of mental scanning of a topographic representation built from a text. Cerebral Cortex, 12 (12), 1322–1330, https://doi.org/10.1093/cercor/12.12.1322.
Mellet, E., Petit, L., Mazoyer, B., Denis, M., & Tzourio, N. (1998). Reopening the mental imagery debate: Lessons from functional anatomy. NeuroImage, 8 (2), 129–139, https://doi.org/10.1006/nimg.1998.0355.
Nakagawa, S., & Schielzeth, H. (2013). A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4 (2), 133–142, https://doi.org/10.1111/j.2041-210x.2012.00261.x.
Olsen, R. K., Chiew, M., Buchsbaum, B. R., & Ryan, J. D. (2014). The relationship between delay period eye movements and visuospatial memory. Journal of Vision, 14 (1): 8, 1–11, https://doi.org/10.1167/14.1.8.
Pearson, J., & Clifford, C. W. (2005). Mechanisms selectively engaged in rivalry: Normal vision habituates, rivalrous vision primes. Vision Research, 45 (6), 707–714, https://doi.org/10.1016/j.visres.2004.09.040.
Pearson, J., & Kosslyn, S. M. (2015). The heterogeneity of mental representation: Ending the imagery debate. Proceedings of the National Academy of Sciences, 112, 10089–10092, https://doi.org/10.1073/pnas.1504933112.
Postle, B. R., Idzikowski, C., Della Sala, S., Logie, R. H., & Baddeley, A. D. (2006). The selective disruption of spatial working memory by eye movements. Quarterly Journal of Experimental Psychology, 59 (1), 100–120, https://doi.org/10.1080/17470210500151410.
Pylyshyn, Z. W. (2002). Mental imagery: In search of a theory. Behavioral and Brain Sciences, 25, 157–238.
R Core Team. (2015). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.r-project.org/.
Recarte, M. A., & Nunes, L. M. (2000). Effects of verbal and spatial-imagery tasks on eye fixations while driving. Journal of Experimental Psychology: Applied, 6 (1), 31–43, https://doi.org/10.1037/1076-898X.6.1.31.
Richardson, D. C., & Spivey, M. J. (2000). Representation, space and Hollywood Squares: Looking at things that aren't there anymore. Cognition, 76 (3), 269–295, https://doi.org/10.1016/S0010-0277(00)00084-6.
Slotnick, S. D., Thompson, W. L., & Kosslyn, S. M. (2005). Visual mental imagery induces retinotopically organized activation of early visual areas. Cerebral Cortex, 15 (10), 1570–1583, https://doi.org/10.1093/cercor/bhi035.
Spivey, M. J., & Geng, J. J. (2001). Oculomotor mechanisms activated by imagery and memory: Eye movements to absent objects. Psychological Research, 65 (4), 235–241, https://doi.org/10.1007/s004260100059.
Totten, E. (1935). Eye-movement during visual imagery. Baltimore: Johns Hopkins Press.
Vaidyanathan, P., Pelz, J., Alm, C., Shi, P., & Haake, A. (2014). Recurrence quantification analysis reveals eye-movement behavior differences between experts and novices. In Proceedings of the Symposium on Eye Tracking Research and Applications (pp. 303–306). Safety Harbor, Florida: ACM, https://doi.org/10.1145/2578153.2578207.
Figure 1
 
Recurrence plots of simulated data. Both axes show the fixations of a single person in a single trial. Points indicate pairs of fixations that fell in the same area of the picture. For example, in panel A, the 4th and the 10th fixation were close together and are therefore considered recurrent. Every fixation is ‘recurrent’ with itself, resulting in a diagonal line of self-recurrence. Panel A shows maximal determinism, where two or more areas of a picture were visited in the same order as they had been visited previously. Panel B shows a case of maximal laminarity, indicating that several subsequent fixations were spent in the same area of a picture that was otherwise inspected only once. Panel C shows a pattern of points close to the line of self-recurrence caused by refixations that happen soon after the initial visit of an area, leading to a low CORM value (center of recurrence mass, large gray point). Large temporal gaps between refixations lead to a high CORM value, as illustrated in panel D.
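The four measures illustrated in Figure 1 can be made concrete with a short Python sketch. It is an illustration only, not the analysis code used in this study: it assumes the definitions of recurrence, determinism, laminarity, and CORM given by Anderson et al. (2013) as we understand them, a fixed distance threshold (`radius`) for deciding that two fixations fall in the same area, and a minimum line length of 2 (`min_len`). The function names `rqa` and `run_points` are hypothetical.

```python
import numpy as np


def run_points(values, min_len):
    """Count the points lying in runs of consecutive 1s of length >= min_len."""
    total, run = 0, 0
    for v in list(values) + [0]:  # trailing 0 closes a run that reaches the end
        if v:
            run += 1
        else:
            if run >= min_len:
                total += run
            run = 0
    return total


def rqa(fixations, radius, min_len=2):
    """Return recurrence, determinism, laminarity and CORM (all in percent).

    fixations: sequence of (x, y) positions in temporal order.
    radius:    assumed threshold; two fixations closer than this are recurrent.
    """
    xy = np.asarray(fixations, dtype=float)
    n = len(xy)
    # Pairwise Euclidean distances between all fixations.
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Upper triangle only: the plot is symmetric and the main diagonal
    # (self-recurrence) carries no information.
    upper = np.triu(dist <= radius, k=1).astype(int)
    r = int(upper.sum())
    if r == 0:
        nan = float("nan")
        return {"rec": 0.0, "det": nan, "lam": nan, "corm": nan}

    # Recurrence: share of all possible fixation pairs that are recurrent.
    rec = 100.0 * 2 * r / (n * (n - 1))
    # Determinism: share of recurrent points on diagonal lines,
    # i.e. sequences of areas revisited in the same order.
    diag = sum(run_points(np.diagonal(upper, offset=k), min_len)
               for k in range(1, n))
    det = 100.0 * diag / r
    # Laminarity: share of recurrent points on horizontal or vertical lines,
    # i.e. areas that are looked at for several fixations in a row.
    horiz = sum(run_points(row, min_len) for row in upper)
    vert = sum(run_points(col, min_len) for col in upper.T)
    lam = 100.0 * (horiz + vert) / (2 * r)
    # CORM: mean temporal distance of recurrent points from the main diagonal,
    # normalised by the maximum possible distance (n - 1).
    i, j = np.nonzero(upper)
    corm = 100.0 * float((j - i).sum()) / ((n - 1) * r)
    return {"rec": rec, "det": det, "lam": lam, "corm": corm}
```

For a scan path that visits three distant areas twice in the same order (A, B, C, A, B, C), this sketch yields a determinism of 100% and a laminarity of 0%, analogous to panel A. A path that dwells on one area for several consecutive fixations raises laminarity instead, as in panel B.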
Figure 2
 
Fixation durations as a function of picture category and task. Gray circles and triangles represent individual median fixation durations for all trials in a given combination of factors. Black circles and triangles represent the median of all individual data points in a condition and error bars represent SEM.
Figure 3
 
Median distance to the center of fixations in degrees of visual angle. Gray circles and triangles represent the median of the distances for one individual aggregated over all trials in a given condition. Black circles and triangles represent means of the respective condition over all participants and error bars represent the SEM.
Figure 4
 
Percentage of recurrent fixations during perception, mental imagery, and in simulated scan paths with a central bias. Gray circles and triangles represent individual median recurrence percentages for all trials in one condition. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Figure 5
 
Determinism as a function of picture category and experimental condition (perception, mental imagery, or in simulated scan paths with a central bias). Gray circles and triangles represent individual median determinism values for all trials in a given combination of factors. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Figure 6
 
Laminarity as a function of picture category and task (perception, mental imagery, or in simulated scan paths with a central bias). Gray circles and triangles represent individual median laminarity values for all trials of a participant. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Figure 7
 
Center of recurrence mass as a function of picture category and task (perception, mental imagery, or in simulated data with a central bias). Gray circles and triangles represent individual median CORM values for all trials of an individual participant. Black circles and triangles represent the mean of all gray points in a condition and error bars represent the SEM.
Figure 8
 
Similarities of scan paths according to the five MultiMatch measures and ScanMatch. Fixations made while imagining a scene are compared to those during perception and to simulated random scan paths from a distribution with central bias corresponding to the dispersion of real fixations during perception.
Table 1
 
Effects of picture category and comparison type on MultiMatch and ScanMatch similarity measures. Notes: F values for the effects of picture category and comparison type on similarity measured by MultiMatch and ScanMatch.
Supplement 1