The most striking previously reported difference in viewing behavior between static and dynamic scenes is the clustering of gaze across individuals (i.e., attentional synchrony; Mital et al.,
2011; Smith & Henderson,
2008). Various methods have been proposed for expressing the degree of attentional synchrony during a dynamic scene, including entropy (Sawahata et al.,
2008), Kullback-Leibler divergence (Rajashekar, Cormack, & Bovik,
2004), normalized scan path salience (Dorr et al.,
2010; Peters, Iyer, Itti, & Koch,
2005; Taya et al.,
2012), bivariate contour ellipse area (BCEA; Goldstein et al.,
2007; Kosnik, Fikre, & Sekuler,
1986; Ross & Kowler,
2013), and Gaussian mixture modeling (GMM; Mital et al.,
2011; Sawahata et al.,
2008). Each method captures slightly different properties of the gaze distribution: BCEA assumes that all gaze is best described by a single diagonal cluster, GMM assumes multiple spherical clusters, and entropy describes the overall distribution. However, most of these methods have been shown to capture variation in attentional synchrony and to correlate with one another (Dorr et al.,
2010). Here, our interest is in expressing two properties of attentional synchrony: (a) the variance of gaze around a single point and (b) the number of focal points around which gaze is clustered in a particular frame. We therefore used GMM, which represents a collection of unlabeled data points as a mixture of Gaussians, each with its own mean, covariance, and weight parameter. However, this approach requires the number of clusters in the data to be known a priori. Following Sawahata and colleagues (
2008), we can discover the optimal number of clusters that explain a distribution of eye movements using model selection (model selection operates by minimizing the Bayesian information criterion; see Bishop,
2007, for further explanation of the algorithm). Alternatively, the number of clusters can be set a priori. If a single cluster is used, the algorithm models all gaze points for a particular frame with a single Gaussian kernel approximated by a spherical covariance matrix. The closer the covariance of this Gaussian is to zero, the tighter the gaze clustering and therefore the greater the attentional synchrony. For ease of interpretation, the cluster covariance is expressed as the visual angle enclosing 68% of gaze points (i.e., 1 standard deviation of the fitted spherical Gaussian). When a single Gaussian cluster is used, this measure is very similar to BCEA (Goldstein et al.,
2007), although it does not assume independent horizontal and vertical variances and instead uses a spherical cluster. As such, our measure can be considered a more conservative estimate of attentional synchrony than BCEA. However, analysis of our data with both BCEA and single Gaussian modeling revealed strong, significant correlations between the two measures (static + free viewing: r = 0.863, p < 0.001; static + spot-the-location: r = 0.888, p < 0.001; dynamic + free viewing: r = 0.823, p < 0.001; dynamic + spot-the-location: r = 0.874, p < 0.001). GMM was used to allow the second-stage analysis of fitting the minimal number of clusters per frame.
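
To make the single-cluster measure concrete, the following is a minimal sketch rather than the analysis code used here. It assumes that gaze points for one frame are already expressed in degrees of visual angle, that the spherical Gaussian is fitted by maximum likelihood (one shared variance averaged over the horizontal and vertical dimensions), and that BCEA is computed with the commonly used formula 2kπσ_Hσ_V√(1 − ρ²), where the enclosed proportion is P = 1 − e^(−k). The function names `spherical_sd` and `bcea` and the two example frames are hypothetical.

```python
import math


def spherical_sd(points):
    """Standard deviation of one spherical Gaussian fitted to 2-D gaze points.

    Assumption: maximum-likelihood fit, so the single shared variance is the
    mean squared distance from the centroid divided by the two dimensions.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    shared_var = sum(
        (x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points
    ) / (2 * n)
    return math.sqrt(shared_var)


def bcea(points, proportion=0.68):
    """Bivariate contour ellipse area enclosing `proportion` of gaze points.

    Assumption: BCEA = 2 * k * pi * sd_h * sd_v * sqrt(1 - rho^2),
    with proportion = 1 - exp(-k).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    if var_x == 0 or var_y == 0:
        return 0.0  # all points share a coordinate: the ellipse has no area
    rho = cov_xy / math.sqrt(var_x * var_y)
    k = -math.log(1.0 - proportion)
    # max() guards against a tiny negative value caused by rounding
    return 2 * k * math.pi * math.sqrt(var_x * var_y) * math.sqrt(
        max(0.0, 1.0 - rho ** 2)
    )


# Hypothetical gaze positions (degrees) from four viewers on two frames.
frame_tight = [(10.0, 8.0), (10.5, 8.2), (9.8, 7.9), (10.2, 8.1)]
frame_loose = [(2.0, 3.0), (18.0, 12.0), (9.0, 1.0), (14.0, 15.0)]

for name, frame in (("tight", frame_tight), ("loose", frame_loose)):
    print(name, round(spherical_sd(frame), 3), round(bcea(frame), 3))
```

In this sketch both measures are smaller for the tightly clustered frame than for the dispersed one, which matches the interpretation above that smaller values indicate greater attentional synchrony. Fitting more than one cluster and selecting their number by the Bayesian information criterion would require a full mixture-model implementation, which is outside the scope of this example.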