In addition to HMM approaches, there are other analysis methods that can potentially take both temporal and spatial information of eye movements into account.
For example, Jack et al. (2009) analyzed Caucasian and Asian participants’ eye movement data in a facial expression judgment task. They analyzed the spatial information of the eye movements by applying a pixel test to a fixation map (heat map) to find the significantly fixated areas and their centroids. In order to establish a set of common ROIs across conditions for comparison, they pooled all the centroids from the fixation maps in different conditions and ran k-means clustering to find a centroid for each nonoverlapping fixated region. This approach offers data-driven, clearly defined ROIs. Note, however, that the k-means clustering algorithm assumes that each cluster is isotropic and that all clusters have the same spatial size (formally, each cluster is a Gaussian with the same isotropic covariance matrix), and hence the resulting ROIs will tend to be circular and of similar size. In contrast, the HMM approach we used here allows each ROI to have a different shape and size (i.e., each ROI is described by a Gaussian with its own covariance matrix). Thus, the HMM approach offers a more flexible representation of ROIs. In addition, the k-means approach requires the number of ROIs to be set explicitly by the experimenter, whereas our Bayesian HMM automatically selects the number of ROIs based on the observed data and the prior. Although the parameters of the Bayesian prior are still selected by the experimenter, the prior only indirectly affects the number of ROIs (in general, the Bayesian HMM will find the same number of ROIs over a range of prior parameters).
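To illustrate this difference concretely, the following sketch (hypothetical Python using scikit-learn, with simulated fixation coordinates) contrasts k-means, whose clusters are implicitly circular and of equal size, with a Gaussian mixture in which each component carries its own full covariance matrix, analogous to the Gaussian ROIs of the HMM. The Bayesian selection of the number of ROIs described above is not reproduced here; the number of components is fixed at two for simplicity.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Hypothetical fixation locations (x, y in pixels), e.g., pooled over trials.
rng = np.random.default_rng(0)
fixations = np.vstack([
    rng.multivariate_normal([320, 240], [[400, 0], [0, 400]], 200),        # compact ROI
    rng.multivariate_normal([320, 420], [[2500, 800], [800, 300]], 200),   # elongated ROI
])

# k-means: every cluster is implicitly a Gaussian with the same isotropic
# covariance, so the recovered ROIs are circular and of similar size.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(fixations)
print("k-means centroids:\n", km.cluster_centers_)

# Gaussian mixture with full covariances: each ROI gets its own shape and
# size, analogous to the Gaussian emission densities of the HMM's hidden states.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(fixations)
print("ROI means:\n", gmm.means_)
print("ROI covariances (one per ROI):\n", gmm.covariances_)
```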
As for the temporal information of the eye movements, after obtaining the ROIs, Jack et al. (2009) used the ROIs to describe sequences of fixations as strings, and then used the minimum description length method to extract regular patterns from the data. Note that in describing a fixation sequence as a string, each fixation is assigned to only one ROI (i.e., a “hard” assignment). However, a “hard” assignment may not be suitable for a fixation that is close to the boundary between two ROIs, since a small amount of noise (e.g., due to measurement error in the eye tracker) could cause the fixation to be categorized differently, resulting in a different string and perhaps confounding the subsequent analysis. In contrast, when estimating the ROIs and transition matrices, our HMM uses “soft” assignments, where each fixation is associated with a posterior probability of belonging to each ROI. Hence, a fixation on an ROI boundary is considered to belong equally to both ROIs, thus reducing potential problems due to these ambiguous fixations.
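As a brief sketch of the difference between “hard” and “soft” assignments (hypothetical Python; scikit-learn’s GaussianMixture is used here as a stand-in for the HMM’s Gaussian ROIs, with the temporal component omitted), a fixation midway between two ROIs receives a single label under a hard assignment but roughly equal posterior probabilities under a soft assignment:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical fixations drawn from two neighboring ROIs (e.g., left and right eye regions).
fixations = np.vstack([
    rng.multivariate_normal([260, 230], [[300, 0], [0, 300]], 150),
    rng.multivariate_normal([380, 230], [[300, 0], [0, 300]], 150),
])
rois = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(fixations)

# A "hard" assignment forces this boundary fixation into a single ROI, so a small
# amount of eye-tracker noise could flip its label (and the resulting string).
boundary_fixation = np.array([[320.0, 230.0]])
print("hard label :", rois.predict(boundary_fixation))

# A "soft" assignment keeps the posterior probability of each ROI,
# so an ambiguous fixation contributes roughly equally to both.
print("posteriors :", rois.predict_proba(boundary_fixation).round(2))
```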
When extracting regular patterns, Jack et al. (2009) considered fixation patterns from zeroth order (single fixations) up to third order (sequences of four fixations). In contrast, the transition matrix of the HMM used here describes only the first-order transition probabilities between any two ROIs and does not contain information about the whole sequence. Nevertheless, the HMM approach can be expanded to incorporate higher-order transition probabilities (i.e., higher-order HMMs).
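For concreteness, a first-order transition matrix simply gives the probability of the next ROI conditioned on the current one. The sketch below (hypothetical Python, with a made-up sequence of hard ROI labels; the actual HMM estimates these probabilities jointly with the ROIs using soft assignments) shows what such a matrix contains and how a higher-order model would differ:

```python
import numpy as np

# Hypothetical sequence of ROI labels for one trial (0 = eyes, 1 = nose, 2 = mouth).
roi_sequence = [0, 0, 1, 2, 1, 0, 1, 2, 2, 0]
n_rois = 3

# First-order transition matrix: counts of ROI_t -> ROI_{t+1}, normalized by row.
counts = np.zeros((n_rois, n_rois))
for prev, curr in zip(roi_sequence[:-1], roi_sequence[1:]):
    counts[prev, curr] += 1
transitions = counts / counts.sum(axis=1, keepdims=True)
print(transitions)

# A higher-order model would instead condition on the previous k ROIs, e.g., a
# second-order matrix indexed by (ROI_{t-1}, ROI_t) -> ROI_{t+1}.
```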
Finally, in Jack et al. (2009), the analyses of the spatial and temporal information of eye movements were conducted as two separate steps, which is suboptimal from an estimation standpoint, whereas in our HMM approach the two types of information are analyzed simultaneously. In addition, our HMM approach is able to take individual differences into account in group/condition comparisons.