Article  |   September 2011
A novel method for analyzing sequential eye movements reveals strategic influence on Raven's Advanced Progressive Matrices
Journal of Vision September 2011, Vol.11, 10. doi:10.1167/11.10.10
      Taylor R. Hayes, Alexander A. Petrov, Per B. Sederberg; A novel method for analyzing sequential eye movements reveals strategic influence on Raven's Advanced Progressive Matrices. Journal of Vision 2011;11(10):10. doi: 10.1167/11.10.10.

      © 2015 Association for Research in Vision and Ophthalmology.

Abstract

Eye movements are an important data source in vision science. However, the vast majority of eye movement studies ignore sequential information in the data and utilize only first-order statistics. Here, we present a novel application of a temporal-difference learning algorithm to construct a scanpath successor representation (SR; P. Dayan, 1993) that captures statistical regularities in temporally extended eye movement sequences. We demonstrate the effectiveness of the scanpath SR on eye movement data from participants solving items from Raven's Advanced Progressive Matrices Test. Analysis of the SRs revealed individual differences in scanning patterns captured by two principal components that predicted individual Raven scores much better than existing methods. These scanpath SR components were highly interpretable and provided new insight into the role of strategic processing on the Raven test. The success of the scanpath SR in terms of prediction and interpretability suggests that this method could prove useful in a much broader context.

Introduction
Eye movement protocols are an important data source in vision science and psychology (e.g., Buswell, 1935; Yarbus, 1967) and have advanced our knowledge of visual search, scene perception, development, human–computer interaction, reading, and many other fields (see, e.g., Findlay & Gilchrist, 2003; Rayner, 1998 for reviews). Despite this success, the vast majority of eye movement studies have ignored all sequential information in the data and utilized only first-order statistics such as fixation probabilities and dwell times. Although fixation sequences (or scanpaths, Stark & Ellis, 1981) often contain valuable information about underlying cognitive processes, they are difficult to quantify and interpret, and this has traditionally prevented eye-tracking researchers from including them in their analyses. 
Why are scanpaths so difficult to analyze? The fundamental reason is that the number of possible scanpaths grows exponentially with their length. To illustrate, suppose the display is divided into 10 areas of interest (AOIs). Then, there are 10 scanpaths of length 1 but 590,490 (= 10 × 9⁵) scanpaths of length 6, because each fixation after the first can land on any of the 9 other AOIs. The challenge is to tame this combinatorial explosion without losing the sequential information in the process. The existing methods for doing this can be classified into two broad classes. One approach represents scanpaths as strings of letters and uses string-editing distance (the number of additions and subtractions necessary to turn one sequence of letters into another) as a dissimilarity metric (e.g., Brandt & Stark, 1997; Myers & Schoelles, 2005). Yet, string-editing measures have a number of limitations, with the most critical being that they are best suited for comparing short sequences of similar length, making it difficult to infer cognitive states or strategies in temporally extended tasks or to compare across participants or trials that differ in duration. 
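The count of 590,490 can be checked with a few lines of Python. This is only an illustrative sketch: it assumes that consecutive fixations always fall on different AOIs (which is what the figure in the text implies), and the function name `count_scanpaths` is hypothetical rather than anything defined by the authors.

```python
def count_scanpaths(n_aois: int, length: int) -> int:
    """Number of AOI sequences of a given length with no immediate repeats.

    Assumption (implied by the count in the text, not stated outright):
    a fixation never moves from an AOI directly back to the same AOI.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    # First fixation: any AOI. Each later fixation: any AOI except the current one.
    return n_aois * (n_aois - 1) ** (length - 1)


# Growth of the scanpath space for a 10-AOI display.
for length in (1, 2, 6, 20):
    print(length, count_scanpaths(10, length))
```

Even at a length of 20 fixations, far shorter than a typical trial in this study, the number of distinct scanpaths exceeds what any data set could sample.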
Another approach is based on transition probability matrices (e.g., Ellis & Stark, 1986; Ponsoda, Scott, & Findlay, 1995) and Markov models (e.g., Jansen, Marriott, & Yelland, 2007; Salvucci & Anderson, 2001; Simola, Salojärvi, & Kojo, 2008), which can be used to extract and compare regularities in scanpaths of varying length. This method also has limitations. While a transition matrix provides a relatively simple representation of scanpath information (a fixed-size matrix), it only estimates the conditional probabilities of scanpaths of length 2. That is, given a current fixation on one AOI, what is the probability to visit each of the other AOIs on the next fixation? This is a very limited event horizon—reaching only one step into the future. Higher order transition matrices extend the horizon to two steps (or more), but there seldom are enough data to provide accurate estimates of the (exponentially growing number of) higher order probabilities. Hidden Markov models (HMMs, e.g., Rabiner, 1989) deal with the combinatorial explosion by factoring the joint probability density into smaller, more manageable pieces using conditional independence assumptions. When these assumptions are met, HMMs have been applied successfully in eye movement data analysis (e.g., Cagli, Coraggio, Napoletano, & Boccignone, 2008; Salvucci & Anderson, 2001; Simola et al., 2008; van der Lans, Pieters, & Wedel, 2008) and active computer vision (e.g., Rimey & Brown, 1991). The factorization is formalized in a graphical model whose parameters are then estimated from data via sophisticated algorithms such as Markov chain Monte Carlo (e.g., Scott, 2002; van der Lans et al., 2008). This makes the development of an HMM a slow and laborious process that requires domain knowledge and considerable expertise. This method seems ill suited for exploratory data analysis in domains where the underlying factorization is not known. 
The present article pioneers the use of reinforcement learning algorithms to capture temporally extended sequential information in eye movement protocols. We present a novel application of a temporal-difference learning algorithm (Sutton, 1988; Sutton & Barto, 1998) to construct a successor representation (SR; Dayan, 1993; White, 1995) of an eye movement sequence that keeps the simplicity of the fixed-size transition matrix and extends the event horizon. The key idea is that upon observing a transition from one AOI to another, instead of simply updating the transition probability from the first to the second AOI, we associate the first AOI with the second AOI and all expected subsequent AOIs based on prior visits to the second AOI. This is equivalent to learning to predict future scanpaths based on past scanpaths. After traversing the entire fixation sequence for a trial, the resulting SR can be conceptualized as having extracted the statistical regularities in temporally extended scanpaths, collapsing the information into a fixed-size matrix. Specifically, an SR matrix contains, for each AOI, the temporally discounted number of expected future fixations to all AOIs (Dayan, 1993). Given their uniform size, the SR matrices from different observers and/or trials can be analyzed using standard statistical methods to identify significant regularities for various comparisons of interest. The new method is very well suited for exploratory data analysis. 
To demonstrate the effectiveness of the scanpath SR as an exploratory tool, we apply this method to discern individual differences in problem solving strategies on a benchmark test of fluid intellectual ability, Raven's Advanced Progressive Matrices (APM; Raven, Raven, & Court, 1998). The Raven APM is a geometric analogy test with excellent psychometric properties (Brouwers, Van de Vijver, & Van Hemert, 2009) that has, for 70 years, been a popular and trusted instrument in clinical (e.g., Soulieres et al., 2009), developmental (e.g., Eslinger et al., 2009), and cognitive (e.g., Gray, Chabris, & Braver, 2003) psychology. As we report in the Results and discussion section below, the SR analysis allows us to predict individual Raven scores with unprecedented precision from the eye movement data. In the process, the scanpath SR also yields new theoretical insights about Raven problem solving strategies. 
We can evaluate scanning patterns to predict Raven scores because both measures correlate with a third, hidden variable: strategy. Individuals differ in their problem solving strategies and this is detectable in their eye movements (e.g., Just & Carpenter, 1985). A Raven problem consists of a matrix and 8 response alternatives (Figure 1, left). Two strategies are particularly relevant for such problems (Snow, 1980). In constructive matching, the participant tries to formulate the missing element based exclusively on matrix information, and then looks for that element in the response area. In response elimination, each alternative is inspected in turn and evaluated to determine whether it fits into the empty matrix slot. The former strategy tends to occur in high-scoring individuals and/or easier problems, the latter in low-scoring individuals and/or difficult problems (Bethell-Fox, Lohman, & Snow, 1984; Vigneau, Caissie, & Bors, 2006). We will show that the scanpath SR identifies the degree to which participants apply each of these two strategies. This can then be used to predict the individual scores on the Raven task. 
Figure 1
 
Example of the Raven problem format and trial sequence. (Left) The problem matrix and the 8 response alternatives are shown with solid lines. The height of the rectangular box around the matrix subtended 9 degrees of visual angle. Eye fixations were assigned to 10 areas of interest (AOIs) as indicated by dotted lines: nine for the matrix cells (top row = 1–3, middle = 4–6, bottom = 7–9) and one for the entire response area. (Right) Each trial had three phases: fixation, solution, and response. Participants fixated for 1 s. Eye movements and verbal protocols were collected during the solution phase. Moving the mouse cursor out of the fixation box triggered the response phase, during which the problem matrix was masked and the participant clicked on their chosen answer. The intertrial interval (ITI) was 200 ms. (This problem was generated by the authors to protect the security of the standardized test.)
Experiment
Methods
Thirty-five university students with normal or corrected-to-normal vision completed 28 problems from Raven's Advanced Progressive Matrices, Set II (Raven et al., 1998) on two separate days approximately a week apart. The participants were paid $6 per hour plus $1 bonus for each correct answer. Half of them completed items 2, 4, 6, 9, 10, 11, 16, 17, 19, 21, 23, 24, 26, and 29 on the first session and 1, 3, 5, 7, 12, 13, 14, 15, 18, 20, 22, 25, 27, and 28 on the second. The other half completed the same subsets in the opposite order. The instructions followed the Raven APM Manual guidelines for individual test administration (Raven et al., 1998). 
A chin rest was located ≈92 cm away from the 21″ CRT monitor in a darkened room. Each trial began with a brief alert sound and a fixation cross appeared in the middle of the screen (Figure 1, right). After the participant fixated for 1 s, which allowed for equipment recalibration, the Raven problem appeared and the participant had unlimited time to work on it. A mouse click on one of the responses ended the trial. 
Eye-tracking data and “think aloud” protocols (Ericsson & Simon, 1993) were collected on both sessions. Between these main sessions, 23 of the participants completed two additional sessions of paper-and-pencil practice on Raven-like problems (Matzen et al., 2010). This manipulation had no statistically significant effect relative to a control group (F(2,32) = 0.15, p = 0.86). Therefore, we analyzed the test data of all 35 participants together. The paper-and-pencil data and the verbal protocols are beyond the scope of this article. 
Sequential eye movement analysis
Participants' eye movements were recorded using an Eyelink 1000 desktop eye tracker (SR Research, 2006) at a sampling rate of 1000 Hz. Saccades and fixations were segmented with Eyelink's standard algorithm using velocity and acceleration thresholds (SR Research, 2006). Each fixation was assigned to one of the 10 AOIs depicted in Figure 1. The few (<1%) fixations that fell outside of the 10 designated areas were ignored. A single AOI (labeled R) covered the entire response area so that the spatial layout of the answers could not be used to predict the participants' scores. 
We defined a scanpath as the sequence of fixations across the 10 different AOIs on a given trial. The sequences varied widely in length across participants and trials. In an effort to reduce this variability, we clipped 20% from the beginning of each sequence longer than 100 fixations. If the clipped length still exceeded 100, we also clipped 5% from the end. The median length of the clipped scanpaths used in the analyses was 88 fixations (min = 14, max ≈ 1000, IQR = 69). The clipping also helped isolate the period of active problem solving, given that the early fixations tended to survey the matrix and the last few verified the chosen answer. 
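The clipping rule can be sketched as follows. The text does not say how fractional counts were rounded or whether the 5% applies to the original or the already-clipped sequence, so this sketch assumes rounding down and applies the 5% to the already-clipped sequence; the function name `clip_scanpath` and its parameters are hypothetical.

```python
def clip_scanpath(seq, threshold=100, head_pct=20, tail_pct=5):
    """Clip long AOI sequences as described in the text.

    Assumptions (not specified in the article): counts are rounded down,
    and the tail percentage is taken from the already-clipped sequence.
    """
    seq = list(seq)
    if len(seq) > threshold:
        # Drop the first 20% (integer arithmetic avoids float rounding surprises).
        seq = seq[len(seq) * head_pct // 100:]
        if len(seq) > threshold:
            # Still long: also drop the last 5%.
            n_tail = len(seq) * tail_pct // 100
            seq = seq[:len(seq) - n_tail]
    return seq


# Sequences of 80, 110, and 200 fixations (placeholder AOI labels).
for n in (80, 110, 200):
    print(n, len(clip_scanpath(range(n))))
```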
The next step was to calculate the successor representation (SR; Dayan, 1993) for each scanpath. We used a temporal-difference learning algorithm to extract long-range statistical regularities from the sequence. The algorithm treats each scanpath as a first-order Markov chain with the 10 AOIs comprising a discrete, finite state space (Dayan, 1993; White, 1995). The algorithm is incremental and builds a 10 × 10 SR matrix M. The matrix is initialized with zeros and then updated for each transition in the sequence. Consider a transition from state i to state j. The ith column of the matrix (the column corresponding to the “sender” AOI) is updated according to 
$$\Delta M_i = \alpha \left( I_j + \gamma M_j - M_i \right), \tag{1}$$
where I is the identity matrix, each subscript picks a column in a matrix, α is a learning rate parameter (0 < α < 1), and γ is a temporal discount factor (0 < γ < 1). In words, upon observing a transition i → j, the set of expected successors ($M_i$) for the sender i is updated to include the receiver j (represented as a unit column vector $I_j$) and the predicted set of successors ($M_j$) for the new location j, discounted by γ. The latter term is the key to extending the event horizon to encompass both immediate and long-range transitions: it includes the discounted future states in the prediction from the current state. For example, suppose a participant scans the top row of a Raven problem systematically from left to right: 1→2→3→1→2…. Then, the successors of location 1 will include both location 2 and, weighted by γ, location 3. By contrast, a first-order transition matrix would include only the association between 1 and 2. After traversing the whole scanpath, the estimated SR matrix approximates the ideal SR matrix, which contains the temporally discounted number of expected future fixations on all AOIs (rows), given the participant just fixated on any individual AOI (column). Note that the entries in the SR matrix are not probabilities. They are (discounted, expected) numbers of visits, and thus, the sum across each column of the ideal SR matrix equals 
$$1 + \gamma + \gamma^2 + \cdots = \frac{1}{1 - \gamma}. \tag{2}$$
1 provides additional technical details. 
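The update rule in Equation 1 is compact enough to sketch directly. The following NumPy code is an illustration rather than the authors' implementation: the function name `scanpath_sr` and the 0-based AOI indexing are assumptions made here, while the column-as-sender convention and the zero initialization follow the text.

```python
import numpy as np


def scanpath_sr(seq, alpha, gamma, n_aois=10):
    """Estimate a successor representation from one AOI sequence.

    seq holds 0-based AOI indices (an indexing choice made for this sketch).
    Columns of M index the sender AOI and rows index the receiver AOI.
    """
    M = np.zeros((n_aois, n_aois))
    I = np.eye(n_aois)
    for i, j in zip(seq[:-1], seq[1:]):
        # Equation 1: move column i toward (unit vector for j) + gamma * (column j).
        M[:, i] += alpha * (I[:, j] + gamma * M[:, j] - M[:, i])
    return M


# Top-row scan 1 -> 2 -> 3 -> 1 -> ... written with 0-based indices.
seq = [0, 1, 2] * 100
alpha, gamma = 0.233, 0.255  # the fitted values reported in the Results section
M = scanpath_sr(seq, alpha, gamma)

print(np.round(M[:3, :3], 3))                  # successors of the three visited AOIs
print(M[:, :3].sum(axis=0), 1 / (1 - gamma))   # column sums versus Equation 2
```

On this repetitive sequence, the entry linking location 1 to location 3 is positive even though no direct 1→3 transition occurs, and the sums of the visited columns approach the value given by Equation 2. Columns for AOIs that were never fixated stay at zero.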
To summarize, given parameters α and γ, the algorithm produced one 10 × 10 SR matrix per participant per trial. Averaging across the 28 trials for each participant, we were left with 35 individual matrices. Each matrix summarized the eye fixation patterns of the corresponding participant. To reduce the dimensionality of the space, we performed a principal component analysis (PCA; Everitt & Dunn, 2001) of the successor representations. Each SR matrix was reshaped to a vector of 100 features. The whole data set occupied a matrix of size 35 × 100. Following standard PCA practice, we rescaled each feature (column) so that it had zero mean and unit variance across the 35 participants. The first 20 principal components retained over 90% of the variance in the SR data. Conceptually, these components represent dimensions of individual differences in fixation patterns. They are expressed mathematically as orthogonal basis vectors in the 100-dimensional SR space. Each participant was characterized by 20 projections onto this rotated basis. 
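The reshaping, standardization, and projection steps can be sketched with NumPy. This is a minimal illustration under stated assumptions: the random array stands in for the 35 participant-averaged SR matrices (it is placeholder data, not the study's data), the PCA is computed through a singular value decomposition, and the name `pca_projections` is hypothetical.

```python
import numpy as np


def pca_projections(sr_matrices, n_components=20):
    """Standardize flattened SR matrices and project them onto principal components.

    sr_matrices: array of shape (n_participants, n_aois, n_aois).
    Returns (scores, components, explained_variance_ratio).
    """
    n = sr_matrices.shape[0]
    X = sr_matrices.reshape(n, -1)          # e.g. 35 x 100 feature matrix
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                     # guard against constant features
    Z = (X - mean) / std                    # zero mean, unit variance per feature
    # SVD of the standardized data: rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T               # one row of projections per participant
    explained = S**2 / np.sum(S**2)
    return scores, components, explained[:n_components]


# Placeholder data standing in for 35 participant-averaged SR matrices.
rng = np.random.default_rng(0)
fake_sr = rng.random((35, 10, 10))
scores, components, explained = pca_projections(fake_sr)
print(scores.shape, components.shape, explained.sum())
```

Each row of `components` can be reshaped back to 10 × 10 for display, which is how a component can be read as a sender-by-receiver pattern.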
Finally, the cumulative Raven score (i.e., the number of correct responses) of each participant was introduced as the target variable of a hierarchical linear regression analysis. The SR projections entered as predictor variables. 
We also compared the novel SR method to several regression models with traditional predictors based on AOI dwell times. Following Vigneau et al. (2006), we explored the following variables: proportional time on matrix (PTM = the dwell time on the matrix area divided by the overall latency), latency to first toggle (FT = the time stamp of the first saccade to the response area), overall latency on easy items (LEz), the number of toggles on easy items (NT), the toggle rate on easy items (TR = NT divided by item latency), and matrix time distribution index (MTDI = the proportional dwell time on cells 1, 2, 4, and 5 minus the proportional dwell time on cells 3, 6, 7, 8, and 9). An item was defined as “easy” if at least 80% of the participants answered it correctly (Vigneau et al., 2006). The first 5 items in each of our test sets met this criterion in our data. PTM, FT, and MTDI were averaged across all 28 items and LEz, NT, and TR across the 10 easy items. Thus, each participant was characterized by 6 measures, which were then used to predict their Raven score. 
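Some of these dwell-time measures can be illustrated on a single trial. The sketch below makes several assumptions that the text does not spell out: a trial is represented as a list of (AOI, duration in ms) pairs, the overall latency is approximated by the summed fixation durations, and a toggle is counted as any transition between the matrix and the response area in either direction. The function name `dwell_measures` is hypothetical.

```python
def dwell_measures(fixations):
    """Compute simple dwell-time measures for one trial.

    fixations: list of (aoi, duration_ms) pairs, where aoi is an int 1-9
    for a matrix cell or the string "R" for the response area.
    """
    total = sum(d for _, d in fixations)
    on_matrix = sum(d for a, d in fixations if a != "R")
    # Assumption: a toggle is any transition between matrix and response area.
    toggles = sum(
        1
        for (a, _), (b, _) in zip(fixations, fixations[1:])
        if (a == "R") != (b == "R")
    )
    # Latency to first toggle: fixation time elapsed before the first visit to R.
    first_toggle = None
    elapsed = 0
    for a, d in fixations:
        if a == "R":
            first_toggle = elapsed
            break
        elapsed += d
    return {
        "PTM": on_matrix / total if total else float("nan"),
        "FT": first_toggle,
        "NT": toggles,
        "TR": toggles / total if total else float("nan"),
    }


# A made-up trial: three matrix fixations, a visit to R, back to the matrix, then R.
trial = [(1, 300), (2, 250), (3, 200), ("R", 400), (4, 350), ("R", 500)]
print(dwell_measures(trial))
```

All of these measures discard the order of fixations within the matrix, which is the information the SR is designed to retain.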
Results and discussion
The Raven scores varied between 12 and 27 across the 35 participants (M = 21.9, SD = 3.7). We performed a hierarchical linear regression to assess how much of this variance can be explained on the basis of the SR principal component projections. Two of the projections correlated very strongly with the scores, whereas the third best predictor was not significant. Therefore, we used two predictors in all regressions. We implemented a two-tier algorithm to maximize the fit to the Raven scores. In the inner loop, it calculated the SR matrices for given parameters α and γ (Equation 1), then calculated the first 20 principal components and the corresponding projections for each participant, picked the two projections that correlated most strongly with the scores, and constructed a linear regression model with these two predictors. In the outer loop, a Nelder–Mead optimization routine searched for α and γ that maximized the multiple regression coefficient of the inner loop model. The best fit (R² = 0.56) was achieved with learning rate α* = 0.233 and discount factor γ* = 0.255. Figure 2d reports this optimal model. To our knowledge, this is the most accurate prediction of Raven scores based on eye-tracking data reported to date. 
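The final step of the inner loop (selecting two projections and fitting a regression) can be sketched as below. This is an illustration only: it does not reproduce the outer Nelder–Mead search, the name `best_two_predictor_fit` is hypothetical, and the demonstration uses synthetic, mutually orthogonal, mean-zero columns in place of real PCA projections.

```python
import numpy as np


def best_two_predictor_fit(scores, y):
    """Pick the two projections most correlated with y and fit an OLS model.

    scores: (n_participants, n_components) array of PCA projections.
    y: (n_participants,) array of target scores.
    Returns (selected_indices, coefficients_with_intercept, r_squared).
    """
    r = np.array([np.corrcoef(scores[:, k], y)[0, 1] for k in range(scores.shape[1])])
    chosen = np.argsort(-np.abs(r))[:2]            # strongest absolute correlations
    A = np.column_stack([np.ones(len(y)), scores[:, chosen]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
    residuals = y - A @ coef
    r2 = 1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2)
    return chosen, coef, r2


# Synthetic stand-in for projections: orthonormal columns orthogonal to a constant.
rng = np.random.default_rng(1)
B = np.column_stack([np.ones(35), rng.normal(size=(35, 20))])
Q, _ = np.linalg.qr(B)
demo_scores = Q[:, 1:]
# Made-up target that depends only on columns 3 and 7.
demo_y = 20 + 3 * demo_scores[:, 3] - 2 * demo_scores[:, 7]

chosen, coef, r2 = best_two_predictor_fit(demo_scores, demo_y)
print(chosen, np.round(coef, 3), round(r2, 3))
```

Because the two predictors are selected after inspecting their correlations with the scores, the resulting R² is an optimistic estimate of out-of-sample performance.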
Figure 2
 
Principal components, weight matrix, and Raven score prediction for the optimal model described in the text. (a) The first component captures the tendency to scan the problem matrix row by row (as indicated by the 3 × 3 clusters of positive values along the diagonal), whereas (b) the second component penalizes the tendency to toggle to the response area (as indicated by the negative values in the last row). The prediction weight matrix (c) is the sum of these two components, scaled by their respective regression coefficients. The x-axis represents the sender area of interest (AOI) and the y-axis represents the receiver AOI. (d) The predicted versus observed Raven scores for all 35 participants (R² = 0.56).
In addition to providing an accurate prediction of Raven scores, the two scanpath SR principal components selected for the regression had clear interpretations with respect to participants' strategies. Figure 2a shows the first component, which accounted for the largest proportion (31%) of the variance in the scores. It was also the first PCA component, capturing the strongest individual differences in eye movement patterns. This component is characterized by a prominent diagonal “box” structure (Figure 2a). The 3 × 3 red boxes indicate the benefits of systematically scanning within a given row of the problem matrix as opposed to haphazard scanning or column-wise scanning. The positive (red) values “dripping” from each box indicate systematic integration as participants moved from row to row. 
The second component, which accounted for another 25% of the variance in the scores, is dominated by a solid blue line across the response area (Figure 2b). We interpret this solid blue area as an “anti-toggle” component. That is, participants who made fewer toggles from each cell of the problem matrix to the response area achieved higher scores than participants who toggled more frequently. 
Figure 3 illustrates these two strategies on synthetic data. We generated 28 sequences according to the systematic strategy. Each sequence began with 50 fixations within the 3 AOIs on the first row, followed by 50 fixations within the second row, 50 fixations within the third row, and a few fixations to and from the response area. Figure 3a plots one of those sequences. We calculated 28 SR matrices from these sequences using the optimal parameters α* and γ*. Figure 3c plots the average of these matrices. It represents a “pseudo-observer” who consistently follows the systematic strategy on all trials. The diagonal box structure is clearly visible (cf. Figure 2a). Note that the cells along the main diagonal have positive values even though the fixation sequences contained no transitions from any AOI directly back to itself. This illustrates an important difference between the successor representation and a transition probability matrix. Despite the absence of immediate repetitions in the sequence, there are plenty of round-trip scanpaths, which give rise to the positive SR values along the diagonal. We also generated 28 sequences of length 150 according to the toggling strategy. They contained multiple transitions to and from the response area (Figure 3b). The corresponding trial-averaged SR matrix (Figure 3d) has high values along the bottom and right edges, corresponding to scanpaths ending in and starting from R, respectively. Figures 3f and 3g plot the deviations from the grand mean (Figure 3e). This approximates the PCA algorithm, which reorganizes the variance of the individual feature vectors. As our simplified illustration has only two cases, both patterns merge into a single “pseudo-component” that merely changes sign. The behavioral data set contained 35 cases whose SR matrices mixed the systematic pattern with the toggling pattern (and other idiosyncratic patterns) in different proportions. 
The SR projections quantify the degree to which these two strategies are expressed in the scanpaths of each individual participant. The systematic projection was positively correlated with the Raven scores, whereas the toggling projection was negatively correlated. 
Figure 3
 
Synthetic data illustrating the systematic and toggling strategies and their respective successor representations (SRs). Sample fixation sequences generated according to the (a) systematic and (b) toggling strategies. (c, d) The corresponding SR matrices, each averaged across 28 replications. The diagonal box structure in (c) reflects the row-by-row scanning pattern in (a), whereas the bottom-heavy matrix in (d) reflects the toggles to the response area. The matrix in (e) is the mean of (c) and (d). (f, g) The deviations from the mean—hence, the negative (blue) values. Compare with Figure 2.
Prior studies have attempted to characterize the constructive matching and response elimination strategies with more traditional dwell time variables. The previous high-water mark was set by Vigneau et al. (2006), who reported R² = 0.51 (corrected down to 0.48) for predicting Raven scores with a linear combination of the matrix time distribution index (defined in the Methods section), the number of toggles on easy items, and the latency on easy items. When applied to our data, however, these variables achieved a much lower uncorrected R² = 0.16 (Table 1). The most that can be achieved with linear regression on any 3 dwell time predictors on our data is R² = 0.21 (Table 1, bottom row). 
Table 1
 
Goodness-of-fit R² and leave-one-out cross-validated R²_cv for predicting Raven scores from eye movement data. The top line reports the performance of the novel method based on successor representations and principal component analysis (PCA). It is compared to some prominent dwell time variables from the literature (Vigneau et al., 2006) and to first- and second-order transition probability matrices.
Predictor                                       R²      R²_cv
Successor representation (with PCA)             0.56    0.41
Variables used by Vigneau et al. (2006)
    Proportional time on matrix (PTM)           0.17    0.09
    Latency to first toggle (FT)                0.02    0.01
    Latency on easy items (LEz)                 0.11    0.04
    Number of toggles on easy items (NT)        0.01    0.00
    Toggle rate on easy items (TR)              0.12    0.04
    Matrix time distribution index (MTDI)       0.02    0.01
    Vigneau et al.'s model (MTDI + NT + LEz)    0.16    0.03
    Best traditional model (PTM + TR + LEz)     0.21    0.09
Transition probability matrices (with PCA)
    First-order transitions, 2 components       0.29    0.01
    First-order transitions, 4 components       0.51    0.07
    Second-order transitions, 2 components      0.42    0.19
    Second-order transitions, 4 components      0.57    0.26
Apparently, as Vigneau et al. (2006) acknowledge, these methods of quantifying eye movement data are noisy and thus susceptible to overfitting. This raises the question of how well the scanpath SR would perform on new data. We conducted leave-one-out cross-validation to test the generalization performance of our method. We partitioned the data into a training set of 34 participants and a test set of 1 participant. We ran our two-tier algorithm on the training set. The parameters α and γ optimized on the training set were then used to calculate the SR matrix for the fixation sequences in the test set. Finally, we calculated the model's prediction of the test Raven score by multiplying the test SR matrix by the weight matrix estimated from the training set. We repeated this process 35 times, testing on the data from each participant in turn. This produced 35 predicted scores, each one based on a model that had no access to the data that were subsequently used to test it. The squared correlation between these cross-validated predictions and the observed scores was R²_cv = 0.41. This is a much better estimate of generalization performance than the goodness-of-fit R² on the training set (Haykin, 2009). The latter is inflated because it reflects not only the genuine regularities in the population, which will generalize to new cases, but also the idiosyncrasies of the training sample, which will not. This explains the drop from R² = 0.56 to R²_cv = 0.41. Note that this still is very respectable cross-validated performance, which sets a new benchmark for Raven score prediction. For comparison, the corresponding values for the best model based on dwell time variables were R² = 0.21 and R²_cv = 0.09 (Table 1). This suggests that the SR algorithm can extract reliable regularities from the data much better than traditional dwell time methods. 
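The leave-one-out logic can be sketched for a fixed predictor matrix. This is a deliberately simplified illustration: the procedure described above re-ran the SR parameter search and the PCA inside every fold, whereas this sketch only refits an ordinary least squares model on each training set. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]


def squared_corr(a, b):
    return np.corrcoef(a, b)[0, 1] ** 2


def loo_r2(X, y):
    """Squared correlation between leave-one-out predictions and observations."""
    n = len(y)
    preds = np.empty(n)
    for k in range(n):
        train = np.arange(n) != k               # hold out participant k
        coef = fit_ols(X[train], y[train])      # fit on the remaining n - 1
        preds[k] = predict_ols(coef, X[k:k + 1])[0]
    return squared_corr(preds, y)


# Synthetic example: 35 cases, 2 predictors, made-up weights and noise level.
rng = np.random.default_rng(2)
X = rng.normal(size=(35, 2))
y_exact = 5 + X @ np.array([1.5, -2.0])
y_noisy = y_exact + rng.normal(scale=3.0, size=35)

in_sample = squared_corr(predict_ols(fit_ols(X, y_noisy), X), y_noisy)
print("in-sample R^2:", round(in_sample, 3), "LOO R^2:", round(loo_r2(X, y_noisy), 3))
```

Any step that looks at the scores (component selection, parameter tuning) has to happen inside the loop, on the training cases only, for the cross-validated estimate to be honest.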
The SR advantage comes from the sequential information in scanpaths and from the data-smoothing properties of the temporal-difference learning algorithm. 
The success of the scanpath SR in cross-validated prediction is also a direct result of the stability of the principal components across folds. The same two components—systematicity and toggle—were chosen on all 35 cross-validation folds and were qualitatively indistinguishable from iteration to iteration. Although it is difficult to quantify the component overlap across folds because the two components sometimes switched places, the weight matrices derived from them can be combined linearly. The average weight matrix is shown in Figure 4a and is virtually identical to the weight matrix from the global model trained on all data (Figure 2c). This suggests that the components were not driven by outliers and reflect genuine dimensions of individual differences in scanpath patterns across the majority of observers. The optimized SR parameter values were also quite stable across the 35 folds: mean α = 0.236 (SD = 0.02), mean γ = 0.259 (SD = 0.05). The stability of the temporal discount factor γ suggests that the scanpath patterns have regularities with a characteristic time scale. 
Figure 4
 
Leave-one-out cross-validation results. The average weight matrix (a) across 35 leave-one-out fits is virtually identical to the weight matrix produced by the fit to all data at once (Figure 2c). Each Raven score was predicted by a separate model that had no access to the data for the respective individual. The squared correlation between the cross-validated predictions and the observed scores was R²_cv = 0.41.
Finally, we compared the new scanpath SR method to first- and second-order transition probability matrices (Table 1). We began by calculating the first-order transition matrix for each sequence. Averaging across 28 trials produced one 10 × 10 matrix per participant. After reshaping, the first-order data set occupied a matrix of the same size (35 × 100) as the SR data set and was analyzed and cross-validated in the same way. The first 20 principal components retained 89% of the variance in the first-order data. Hierarchical linear regression with 2 components yielded $R^2$ = 0.29 on the full training set but did not cross-validate ($R^2_{cv}$ = 0.01). Adding variables to the regression model improved the fit only marginally (e.g., $R^2_{cv}$ = 0.07 with 4 components). This suggests that first-order transition matrices are too myopic to support robust prediction of Raven scores. It also demonstrates that the excellent performance of the SR method cannot be attributed to the PCA-based dimensionality reduction algorithm. 
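The first-order baseline can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' code: AOIs are coded 0 to 9, columns index the sender AOI and rows the receiver AOI (following the convention of Figure 2), and the two toy trials and the helper names are made up.

```python
def transition_matrix(seq, n_states=10):
    """First-order transition probabilities estimated by simple counting.

    T[j][i] estimates P(next AOI = j | current AOI = i).
    """
    T = [[0.0] * n_states for _ in range(n_states)]
    for i, j in zip(seq, seq[1:]):
        T[j][i] += 1.0
    for i in range(n_states):
        total = sum(T[j][i] for j in range(n_states))
        if total > 0:  # columns of never-visited senders stay all zero
            for j in range(n_states):
                T[j][i] /= total
    return T


def average_matrices(mats):
    """Element-wise mean of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(m[r][c] for m in mats) / len(mats) for c in range(n)]
            for r in range(n)]


def flatten(mat):
    """Reshape a 10 x 10 matrix into one 100-element feature row."""
    return [x for row in mat for x in row]


# Two toy trials for one hypothetical participant.
trials = [[0, 1, 2, 1, 2, 9, 0], [3, 4, 5, 9, 3, 4]]
per_trial = [transition_matrix(t) for t in trials]
features = flatten(average_matrices(per_trial))
```

Stacking one such 100-element row per participant gives a matrix with the same 35 × 100 shape as the SR data set.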
Second-order transition probabilities are conditionalized on the two preceding fixations in the sequence. This calls for the estimation of a 10 × 10 × 10 matrix per trial. Given that the median (clipped) sequence length was only 88, the second-order estimates were extremely variable even after averaging across the 28 trials. Still, it was interesting to check whether the PCA algorithm could identify individual differences among the participants. After reshaping, the second-order data set occupied a matrix of size 35 × 1000 and the first 20 principal components retained 74% of the variance. Hierarchical linear regression with the second-order projections yielded good fits to the Raven scores (Table 1). The best generalizability ($R^2_{cv}$ = 0.26) was achieved with 4 predictor variables. While quite respectable and much better than the $R^2_{cv}$ achievable with traditional measures, this falls far short of the SR-based prediction. Moreover, unlike the SR-based components (Figure 2), the second-order components were extremely hard to visualize and interpret. 
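A rough count shows why the second-order estimates are so variable. The sketch below assumes, purely for illustration, that every trial has the median length of 88 fixations; the toy sequence and the function name are hypothetical.

```python
from collections import Counter


def second_order_counts(seq):
    """Count triples (a, b, c): fixation c conditioned on the two before it."""
    return Counter(zip(seq, seq[1:], seq[2:]))


n_aois = 10
cells = n_aois ** 3                          # cells in a 10 x 10 x 10 matrix
triples_per_trial = 88 - 2                   # a sequence of 88 yields 86 triples
triples_per_person = 28 * triples_per_trial  # pooled over 28 trials

toy = [0, 1, 2, 0, 1, 2, 9, 0, 1]
counts = second_order_counts(toy)
```

Under that assumption a participant contributes about 2,400 triples to fill 1,000 cells, that is, fewer than 2.5 observations per cell on average before any smoothing.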
The transition-based results suggest two conclusions. First, a single-step event horizon cannot capture the statistical regularities in our data. A temporally extended analysis seems necessary. This explains why the second-order model performed better than the first-order one. The SR-based model performed the best, due in large part to its open-ended event horizon whose effective size was controlled adaptively by the γ parameter. The second conclusion is that the probability estimates need to be smoothed. There are not enough data to populate the matrices by simple counting, particularly in the second-order case. This scarcity of data (rather than computational constraints) appears to be the limiting factor in scanpath analysis in general. The SR learning algorithm (Equation 1) updates a whole column of the matrix after each transition, thereby smoothing the estimates. Stated differently, each cell in the SR matrix aggregates a whole class of observations. For example, cell (1, 1) would be updated after observing any of the following subsequences: 121, 131, …, 1R1; 1231, 1241, …. This reuses the data and reduces the variance of the estimates. This smoothing effect contributed to the stability of the SR components during leave-one-out cross-validation. By contrast, the first-order probability estimates were apparently noisier, and the PCA solution was unstable even though it involved matrices of the same shape estimated from the same data. 
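The column-wise update can be sketched as follows. Equation 1 is not reproduced in this part of the article, so the rule used here, M_i ← M_i + α(I_j + γM_j − M_i) after an observed transition from AOI i to AOI j, is a reconstruction from the description above and from Appendix A. The zero initialization, the 0-based AOI codes, and the function name are assumptions.

```python
def learn_sr(seq, n_states=10, alpha=0.2, gamma=0.3):
    """Temporal-difference estimate of the scanpath successor representation.

    M[r][i] accumulates the discounted expected number of future visits to
    receiver AOI r after a fixation on sender AOI i (columns are senders).
    """
    M = [[0.0] * n_states for _ in range(n_states)]  # assumed zero start
    for i, j in zip(seq, seq[1:]):
        # Snapshot column j so the update is well defined even when i == j.
        col_j = [M[r][j] for r in range(n_states)]
        for r in range(n_states):
            target = (1.0 if r == j else 0.0) + gamma * col_j[r]
            M[r][i] += alpha * (target - M[r][i])  # whole column i moves
    return M


# Toy sequence over two AOIs: the diagonal cells become nonzero although
# the sequence contains no immediate repetitions.
M = learn_sr([0, 1, 0, 1], n_states=2, alpha=0.5, gamma=0.5)
```

In the toy run, cell (0, 0) receives credit only through the γ-discounted column of AOI 1, which is the sense in which one cell aggregates a whole class of subsequences.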
General discussion
Our novel method of eye movement analysis, the scanpath successor representation (SR), produced new results in terms of both successful score prediction and insight into individual differences in problem solving strategies on Raven's Advanced Progressive Matrices. With this method, we were able to extract the underlying structure from complex patterns of sequential eye movements during geometric problem solving. These regularities allowed us to predict APM scores with unprecedented accuracy. More importantly, the principal component analysis of the successor representations produced components that were readily interpretable and consistent with earlier strategy findings. 
The two components of the scanpath SRs that correlated strongly with the scores mapped clearly onto the two main processing strategies for multiple-choice matrix completion problems. The anti-toggle component (Figure 2b) replicated earlier reports of negative correlations between toggling and Raven scores (Bethell-Fox et al., 1984; Carpenter, Just, & Shell, 1990; Vigneau et al., 2006). This qualitative agreement with established results validates the new SR-based method. Quantitatively, however, it goes a step further because it could predict a larger proportion of the variance compared to traditional measures such as the number of toggles or toggle rate. This suggests that the SR-based analysis provides a more sensitive measure of toggling and thus can better identify individuals who follow the response elimination strategy. This article did not address the question of whether response elimination is adopted at the beginning of a problem or only as a fallback strategy on difficult items. This question can be answered by analyzing the SR matrices for individual trials and/or contrasting the early and late portions of the fixation sequences within trials. 
The systematicity component (Figure 2a) is a novel finding and arguably provides the most detailed picture of Raven performance and strategic processing to date. This component demonstrates the importance of processing the problem matrix row by row. Within rows, there was also evidence that integrating cell information is more successful if it is attained by scanning adjacent cells (1→2, 2→3, 3→2) as opposed to skipping over cells (1→3, 4→6). This suggests that row scanning (particularly adjacent cell scanning within rows) is more likely to generate relational insight, which conforms to previous findings that perceptual motor patterns can increase the likelihood of rule insight (Grant & Spivey, 2003; Thomas & Lleras, 2007). This lends new support to the theory that successful Raven solvers use a constructive matching strategy and explicates some important aspects of this strategy. 
We chose Raven's APM as the test bed for the novel scanpath SR method because decades of painstaking research have identified the two strategies most relevant for this domain (e.g., Bethell-Fox et al., 1984; Carpenter et al., 1990; Snow, 1980; Vigneau et al., 2006). Thus, we knew what to expect and could validate the method against these established findings. Still, the method revealed previously unknown details about the constructive matching strategy. More importantly, armed with this powerful tool, we could have discovered these two strategies even if we had never read the Raven literature, simply by interpreting the component matrices in Figure 2. Note that these matrices were calculated in an entirely automated manner and reflect regularities in the data rather than the prior knowledge of the authors. Thus, the scanpath SR method promises to be a great tool for exploratory data analysis, with the potential for rapid discoveries in other domains. 
The power of the scanpath SR stems from the fact that it extends the event horizon of sequential eye movements to extract temporally extended patterns. It will very likely prove useful in any complex task environment that has distinct areas of interest (statically or dynamically defined). This includes other abstract, rule-governed environments such as chess (Charness, Reingold, Pomplun, & Stampe, 2001) or Tower of Hanoi (Patsenko & Altmann, 2010) but also practical applications such as identifying successful and unsuccessful strategies for landing a plane (Anders, 2001; Ottati, Hickox, & Richter, 1999), driving a car (Crundall, Underwood, & Chapman, 1998), or performing laparoscopic surgery (Nicolaou, James, Darzi, & Yang, 2004). 
Appendix A
Technical details and potential improvements
The successor representation was introduced to the reinforcement learning literature by Dayan (1993) and was developed by White (1995). The SR is essentially identical to the fundamental matrix in the theory of Markov chains (Kemeny & Snell, 1976). More recently, Gershman, Moore, Todd, Norman, and Sederberg (under revision) identified a formal connection between the SR and an influential model of episodic and semantic memory, the Temporal Context Model (e.g., Howard & Kahana, 2002; Sederberg, Howard, & Kahana, 2008). 
We use a version of the successor representation that differs slightly from the standard definition (Dayan, 1993; White, 1995). The difference is that, when visiting a state i, our version does not include this same visit in the total (temporally discounted) number of visits to i. Assuming a first-order Markov chain with transition probability matrix T, our SR matrix M is based on the power series: 
$$M = T + \gamma T^2 + \gamma^2 T^3 + \cdots = T (I - \gamma T)^{-1}. \tag{A1}$$
 
The standard definition (Dayan, 1993; White, 1995) is based on the power series $I + \gamma T + \gamma^2 T^2 + \cdots = (I - \gamma T)^{-1}$. To revert to the standard formulation of the SR learning algorithm, the term $I_j$ in Equation 1 must be replaced by $I_i$. In the special case when γ = 0, our algorithm tracks the transition matrix T, whereas the standard formulation tracks the identity matrix I.
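As a worked check on how the two definitions relate, the geometric series can be summed explicitly. The derivation assumes that T is a stochastic matrix and that 0 ≤ γ < 1, so that the series converges; the symbol $M_{\mathrm{std}}$ for the standard SR is introduced here only for illustration.

```latex
\begin{align*}
M &= T \sum_{k=0}^{\infty} (\gamma T)^k = T (I - \gamma T)^{-1}, \\
M_{\mathrm{std}} &= \sum_{k=0}^{\infty} (\gamma T)^k = (I - \gamma T)^{-1} = I + \gamma M, \\
\gamma = 0 &\;\Rightarrow\; M = T, \quad M_{\mathrm{std}} = I .
\end{align*}
```

The two versions therefore carry the same information for γ > 0: the standard SR equals the identity matrix plus γ times the version used here.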
The proof that the temporal-difference learning algorithm in Equation 1 converges to the true successor representation M (White, 1995) is a direct application of more general convergence proofs about TD(λ) learning in the reinforcement learning literature (Dayan, 1992; Jaakkola, Jordan, & Singh, 1994; Sutton, 1988). To ensure convergence, it is necessary to decrease the learning rate α as the data accumulate. The technical conditions include 
$$\sum_{n=0}^{\infty} \alpha_n = \infty \quad \text{and} \quad \sum_{n=0}^{\infty} \alpha_n^2 < \infty, \tag{A2}$$
where n is the number of observations (Dayan & Sejnowski, 1993, cited in White, 1995). 
This indicates that the learning rate should be inversely related to the length of the data sequence. This in turn suggests a potential improvement of our eye-tracking analysis application. In the present article, we used a fixed α for all sequences regardless of length. It would be interesting to explore parameterizations that reduce the effective learning rate for longer sequences. The clipping of sequences longer than 100 fixations (described in the Methods section) is a crude way of regularizing the sequence length. Our present results indicate that, even with a fixed learning rate, the learning algorithm can accommodate substantial variability in length. As mentioned earlier, this is a major advantage over string-editing methods for comparing scanpaths. Varying the learning rate as a function of sequence length will provide additional robustness and reduce the variance of the estimates. This is a promising topic for future research. 
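To make the conditions in Equation A2 concrete, the following worked example contrasts a decreasing schedule with a fixed learning rate. The harmonic schedule $\alpha_n = 1/(n+1)$ is an illustrative assumption and was not used in the present analysis.

```latex
\begin{align*}
\alpha_n = \frac{1}{n+1}: \quad & \sum_{n=0}^{\infty} \frac{1}{n+1} = \infty, \qquad \sum_{n=0}^{\infty} \frac{1}{(n+1)^2} = \frac{\pi^2}{6} < \infty ; \\
\alpha_n = \alpha > 0: \quad & \sum_{n=0}^{\infty} \alpha = \infty, \qquad \sum_{n=0}^{\infty} \alpha^2 = \infty .
\end{align*}
```

A fixed rate satisfies the first condition but not the second, so it falls outside the convergence guarantee. This is consistent with the suggestion that lowering the effective rate for longer sequences would reduce the variance of the estimates.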
Another promising possibility is to improve the feature selection algorithm. Independent Component Analysis (ICA; Stone, 2004) may be better suited for eye-tracking applications than PCA because it relaxes the orthogonality constraint on the components. The SR matrices that correspond to psychologically relevant strategies are not necessarily orthogonal. 
Acknowledgments
The authors thank James Todd and Vladimir Sloutsky for valuable suggestions on the manuscript. 
Commercial relationships: none. 
Corresponding author: Alexander A. Petrov. 
Email: apetrov@alexpetrov.com. 
Address: Department of Psychology, 200B Lazenby Hall, Ohio State University, Columbus, OH 43210, USA. 
Footnotes
1  There was a significant practice effect within subjects, but it did not interact significantly with the between-subject manipulation. The posttest score was 1.5 points higher on average than the pretest score (t(34) = 3.48, p < 0.001), replicating published results (Bors & Vigneau, 2003; Denney & Heidrich, 1990).
2  The mean scores (and SDs) for the two 14-item subsets were 10.7 (2.8) and 11.2 (1.8). The subsets were counterbalanced across the first (M = 10.2, SD = 2.5) and second (M = 11.7, SD = 2.0) sessions.
References
Anders G. (2001). Pilot's attention allocation during approach and landing: Eye- and head-tracking research in an A330 full flight simulator. In Jensen R. (Ed.), Proceedings of the 11th International Symposium on Aviation Psychology (pp. 1–6). Mahwah, NJ: Lawrence Erlbaum Associates.
Bethell-Fox C. E. Lohman D. F. Snow R. E. (1984). Adaptive reasoning: Componential and eye movement analysis of geometric analogy performance. Intelligence, 8, 205–238.
Bors D. A. Vigneau F. (2003). The effect of practice on Raven's advanced progressive matrices. Learning and Individual Differences, 13, 291–312.
Brandt S. A. Stark L. W. (1997). Spontaneous eye movements during visual imagery reflect the content of the visual scene. Journal of Cognitive Neuroscience, 9, 27–38.
Brouwers S. A. Van de Vijver F. J. R. Van Hemert D. A. (2009). Variation in Raven's progressive matrices scores across time and place. Learning and Individual Differences, 19, 330–338.
Buswell G. T. (1935). How people look at pictures. Chicago: University of Chicago Press.
Cagli R. C. Coraggio P. Napoletano P. Boccignone G. (2008). What the draughtsman's hand tells the draughtsman's eye: A sensorimotor account of drawing. International Journal of Pattern Recognition and Artificial Intelligence, 22, 1015–1029.
Carpenter P. A. Just M. A. Shell P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices test. Psychological Review, 97, 404–431.
Charness N. Reingold E. M. Pomplun M. Stampe D. M. (2001). The perceptual aspect of skilled performance in chess: Evidence from eye movements. Memory and Cognition, 29, 1146–1152.
Crundall D. E. Underwood G. Chapman P. R. (1998). How much do novice drivers see? The effects of demand on visual search strategies in novice and experienced drivers. In Underwood G. (Ed.), Eye guidance in reading and scene perception (pp. 395–418). Amsterdam, The Netherlands: Elsevier.
Dayan P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8, 341–362.
Dayan P. (1993). Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5, 613–624.
Dayan P. Sejnowski T. J. (1993). TD(λ) converges with probability 1 (Tech. Rep.). San Diego, CA: CNL, The Salk Institute.
Denney N. W. Heidrich S. M. (1990). Training effects on Raven's progressive matrices in young, middle-aged, and elderly adults. Psychology and Aging, 5, 144–145.
Ellis S. R. Stark L. (1986). Statistical dependency in visual scanning. Human Factors, 28, 421–438.
Ericsson K. A. Simon H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge, MA: MIT Press.
Eslinger P. J. Blair C. Wang J. L. Lipovsky B. Realmuto J. Baker D. et al. (2009). Developmental shifts in fMRI activations during visuospatial relational reasoning. Brain and Cognition, 69, 1136–1149.
Everitt B. S. Dunn G. (2001). Applied multivariate analysis. New York: Oxford University Press.
Findlay J. M. Gilchrist I. D. (2003). Active vision: The psychology of looking and seeing. Oxford, UK: Oxford University Press.
Gershman S. J. Moore C. D. Todd M. T. Norman K. A. Sederberg P. B. (under revision). The successor representation and temporal context.
Grant E. R. Spivey M. J. (2003). Eye movements and problem solving: Guiding attention guides thought. Psychological Science, 14, 462–466.
Gray J. R. Chabris C. F. Braver T. S. (2003). Neural mechanisms of general fluid intelligence. Nature Neuroscience, 6, 316–322.
Haykin S. (2009). Neural networks and learning machines (3rd ed.). New York: Prentice Hall.
Howard M. W. Kahana M. J. (2002). A distributed representation of temporal context. Journal of Mathematical Psychology, 46, 269–299.
Jaakkola T. Jordan M. I. Singh S. P. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6, 1185–1201.
Jansen A. R. Marriott K. Yelland G. W. (2007). Parsing of algebraic expressions by experienced users of mathematics. European Journal of Cognitive Psychology, 19, 286–320.
Just M. A. Carpenter P. A. (1985). Cognitive coordinate systems: Accounts of mental rotation and individual differences in spatial ability. Psychological Review, 92, 137–172.
Kemeny J. G. Snell J. L. (1976). Finite Markov chains. New York: Springer.
Matzen L. E. Benz Z. O. Dixon K. R. Posey J. Kroger J. K. Speed A. E. (2010). Recreating Raven's: Software for systematically generating large numbers of Raven-like matrix problems with normed properties. Behavior Research Methods, 42, 525–541.
Myers C. W. Schoelles M. J. (2005). ProtoMatch: A tool for analyzing high-density, sequential eye gaze and cursor protocols. Behavior Research Methods, 37, 256–270.
Nicolaou M. James A. Darzi A. Yang G. (2004). A study of saccade transition for attention segregation and task strategy in laparoscopic surgery. In Barillot C. Haynor D. R. Hellier P. (Eds.), Medical image computing and computer-assisted intervention (vol. 3217, pp. 97–104). Berlin, Germany: Springer.
Ottati W. L. Hickox J. C. Richter J. (1999). Eye scan patterns of experienced and novice pilots during visual flight rules (VFR) navigation. In Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting (vols. 1 and 2, pp. 66–70). Santa Monica, CA: Human Factors and Ergonomics Society.
Patsenko E. G. Altmann E. M. (2010). How planful is routine behavior? A selective-attention model of performance in the Tower of Hanoi. Journal of Experimental Psychology General, 139, 95–116.
Ponsoda V. Scott D. Findlay J. M. (1995). A probability vector and transition matrix analysis of eye movements during visual search. Acta Psychologica, 88, 167–185.
Rabiner L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77, 257–286.
Raven J. C. Raven J. Court J. H. (1998). Manual for Raven's progressive matrices and vocabulary scales. Section 4: Advanced progressive matrices. San Antonio, TX: Pearson.
Rayner K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Rimey R. D. Brown C. M. (1991). Controlling eye movements with hidden Markov models. International Journal of Computer Vision, 7, 47–65.
Salvucci D. D. Anderson J. R. (2001). Automated eye-movement protocol analysis. Human-Computer Interaction, 16, 39–86.
Scott S. L. (2002). Bayesian methods for hidden Markov models: Recursive computing in the 21st century. Journal of the American Statistical Association, 97, 337–351.
Sederberg P. B. Howard M. W. Kahana M. J. (2008). A context-based theory of recency and contiguity in free recall. Psychological Review, 115, 893–912.
Simola J. Kojo I. (2008). Using hidden Markov model to uncover processing states from eye movements in information search tasks. Cognitive Systems Research, 9, 237–251.
Snow R. E. (1980). Aptitude processes. In Snow R. E. Federico P. A. Montague W. E. (Eds.), Aptitude, learning, and instruction: Vol. 1. Cognitive process analyses of aptitude (pp. 27–63). Hillsdale, NJ: Erlbaum.
Soulieres I. Dawson M. Samson F. Barbeau E. B. Sahyoun C. P. Strangman G. E. et al. (2009). Enhanced visual processing contributes to matrix reasoning in autism. Human Brain Mapping, 30, 4082–4107.
SR Research (2006). Eyelink 1000 user's manual. Mississauga, ON, Canada: SR Research.
Stark L. Ellis S. R. (1981). Scanpath revisited: Cognitive models of direct active looking. In Fisher D. F. Monty R. A. Senders J. W. (Eds.), Eye movements: Cognition and visual perception (pp. 193–226). Hillsdale, NJ: Lawrence Erlbaum Associates.
Stone J. V. (2004). Independent component analysis: A tutorial introduction. Cambridge, MA: MIT Press.
Sutton R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3, 9–44.
Sutton R. S. Barto A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Thomas L. E. Lleras A. (2007). Moving eyes and moving thought: On the spatial compatibility between eye movements and cognition. Psychonomic Bulletin & Review, 14, 663–668.
van der Lans R. Pieters R. Wedel M. (2008). Eye-movement analysis of search effectiveness. Journal of the American Statistical Association, 103, 452–461.
Vigneau F. Caissie A. F. Bors D. A. (2006). Eye-movement analysis demonstrates strategic influences on intelligence. Intelligence, 34, 261–272.
White L. M. (1995). Temporal difference learning: Eligibility traces and the successor representation for actions. Unpublished master's thesis, Department of Computer Science, University of Toronto, Canada.
Yarbus A. L. (1967). Eye movements in vision (L. A. Riggs, Trans.). New York: Plenum Press.
Figure 1
 
Example of the Raven problem format and trial sequence. (Left) The problem matrix and the 8 response alternatives are shown with solid lines. The height of the rectangular box around the matrix subtended 9 degrees of visual angle. Eye fixations were assigned to 10 areas of interest (AOIs) as indicated by dotted lines: nine for the matrix cells (top row = 1–3, middle = 4–6, bottom = 7–9) and one for the entire response area. (Right) Each trial had three phases: fixation, solution, and response. Participants fixated for 1 s. Eye movements and verbal protocols were collected during the solution phase. Moving the mouse cursor out of the fixation box triggered the response phase, during which the problem matrix was masked and the participant clicked on their chosen answer. The intertrial interval (ITI) was 200 ms. (This problem was generated by the authors to protect the security of the standardized test.)
Figure 2
 
Principal components, weight matrix, and Raven score prediction for the optimal model described in the text. (a) The first component captures the tendency to scan the problem matrix row by row (as indicated by the 3 × 3 clusters of positive values along the diagonal), whereas (b) the second component penalizes the tendency to toggle to the response area (as indicated by the negative values in the last row). The prediction weight matrix (c) is the sum of these two components, scaled by their respective regression coefficients. The x-axis represents the sender area of interest (AOI) and the y-axis represents the receiver AOI. (d) The predicted versus observed Raven scores for all 35 participants ($R^2$ = 0.56).
Figure 3
 
Synthetic data illustrating the systematic and toggling strategies and their respective successor representations (SRs). Sample fixation sequences generated according to the (a) systematic and (b) toggling strategies. (c, d) The corresponding SR matrices, each averaged across 28 replications. The diagonal box structure in (c) reflects the row-by-row scanning pattern in (a), whereas the bottom-heavy matrix in (d) reflects the toggles to the response area. The matrix in (e) is the mean of (c) and (d). (f, g) The deviations from the mean—hence, the negative (blue) values. Compare with Figure 2.
Table 1
 
Goodness-of-fit $R^2$ and leave-one-out cross-validated $R^2_{cv}$ for predicting Raven scores from eye movement data. The top line reports the performance of the novel method based on successor representations and principal component analysis (PCA). It is compared to some prominent dwell time variables from the literature (Vigneau et al., 2006) and to first- and second-order transition probability matrices.
Predictor                                       $R^2$   $R^2_{cv}$
Successor representation (with PCA)             0.56    0.41
Variables used by Vigneau et al. (2006)
    Proportional time on matrix (PTM)           0.17    0.09
    Latency to first toggle (FT)                0.02    0.01
    Latency on easy items (LEz)                 0.11    0.04
    Number of toggles on easy items (NT)        0.01    0.00
    Toggle rate on easy items (TR)              0.12    0.04
    Matrix time distribution index (MTDI)       0.02    0.01
    Vigneau et al.'s model (MTDI + NT + LEz)    0.16    0.03
    Best traditional model (PTM + TR + LEz)     0.21    0.09
Transition probability matrices (with PCA)
    First-order transitions, 2 components       0.29    0.01
    First-order transitions, 4 components       0.51    0.07
    Second-order transitions, 2 components      0.42    0.19
    Second-order transitions, 4 components      0.57    0.26
© 2011 ARVO