Open Access
Article  |   November 2019
The Neyman Pearson detection of microsaccades with maximum likelihood estimation of parameters
Journal of Vision November 2019, Vol.19, 17. doi:https://doi.org/10.1167/19.13.17
Hongzhi Zhu, Septimiu Salcudean, Robert Rohling; The Neyman Pearson detection of microsaccades with maximum likelihood estimation of parameters. Journal of Vision 2019;19(13):17. https://doi.org/10.1167/19.13.17.
Abstract

Despite the fact that the velocity threshold method is widely applied, the detection of microsaccades continues to be a challenging problem, due to gaze-tracking inaccuracy and the transient nature of microsaccades. Important parameters associated with a saccadic event, e.g., saccade duration, amplitude, and maximum velocity, are sometimes imprecisely estimated, which may lead to biases in inferring the roles of microsaccades in perception and cognition. To overcome the biases and have a better detection algorithm for microsaccades, we propose a novel statistical model for the tracked gaze positions during eye fixations. In this model, we incorporate a parametrization that has been previously applied to model saccades, which allows us to veridically capture the velocity profile of saccadic eye movements. Based on our model, we derive the Neyman Pearson Detector (NPD) for saccadic events. Implemented in conjunction with the maximum likelihood estimation method, our NPD can detect a saccadic event and estimate all parameters simultaneously. Because of its adaptive nature and its statistical optimality, our NPD method was able to better detect microsaccades in some datasets when compared with a recently proposed state-of-the-art method based on convolutional neural networks. NPD also yielded comparable performance with a recently developed Bayesian algorithm, with the added benefit of modeling a more biologically veridical velocity profile of the saccade. As opposed to these algorithms, NPD can lend itself better to online saccade detection, and thus has potential for human-computer interaction applications. Our algorithm is publicly available at https://github.com/hz-zhu/NPD-micro-saccade-detection.

Introduction
Vision is the primary method for us to perceive the world and communicate with each other (Dowling & Dowling, 2016). Recent advancements in noninvasive eye-gaze tracking technologies have further increased interest in vision research, as we can exploit gaze position to gain knowledge about, for instance, human allocation of attention and inform human-computer interaction (HCI) technologies (Chennamma & Yuan, 2013; Ruhland et al., 2015; Zhu, Salcudean, & Rohling, 2019). Yet gaze-tracking for functional and cognitive research of the visual system is challenging because of the gaze-tracking inaccuracies as well as the stochastic nature of eye movements. 
People's eyes are never still, not even during deliberate fixation. Ciuffreda and Tannen (1995) and Wandell (1995) found that our eyes exhibit three types of involuntary movements during fixation: tremor, drift, and microsaccades. Tremor is a high-frequency, very low-amplitude movement first discovered by Adler and Fliegelman (1934). It is a highly rhythmical, oscillatory movement with a peak frequency around 80 Hz. Eye-tracking equipment with very high accuracy is needed to record tremor (Bengi & Thomas, 1968); hence, common commercial eye-tracking devices cannot capture it reliably (Rolfs, 2009). Drift movements have low amplitudes and low velocities: for prolonged fixations, drift has amplitudes between 1′ and 8′ of visual angle and velocities usually below 30′ s−1 (Deubel & Bridgeman, 1995; Rolfs, 2009). The microsaccade is a high-velocity, short-duration eye movement that occurs a few times per second, with durations ranging from 5–6 ms to about 20–30 ms (Engbert, 2006). Microsaccades and saccades generally form a continuum of oculomotor behaviors (Otero-Millan et al., 2013; Zuber, Stark, & Cook, 1965). In recent years, microsaccades have commonly been differentiated from saccades as events with amplitude below 1° of visual angle (Sinn & Engbert, 2016). Some microsaccades can be much smaller than that, sometimes coming close to the measurement noise of many eye-tracking devices (for instance, the EyeLink II video-based infrared eye tracker reports a precision of 0.01° of visual angle; Engbert, 2006; Martinez-Conde, Macknik, Troncoso, & Hubel, 2009). Their small amplitude combined with their short duration partly accounts for the difficulty of microsaccade detection, which remains an ongoing research problem. 
Past research on detecting microsaccades can be roughly classified into three categories: feature-based detection methods (FBDMs), statistical detection methods (SDMs), and, more recently, data-driven detection methods (DDDMs). The features commonly used by FBDMs are the velocity and acceleration of a microsaccade. In Engbert (2006), the velocity threshold method is proposed to detect microsaccades, and in Otero-Millan et al. (2014), an unsupervised clustering method based on multiple features (peak velocity and acceleration) is used to differentiate a microsaccade from other types of eye movements. A major disadvantage of FBDMs is their vulnerability to noise; therefore, in Sheynikhovich, Bécu, Wu, and Arleo (2018), FBDMs suitable for the high-noise regime are proposed. Several SDMs have also been developed to handle the noise in the tracked eye-gaze positions. Particle filtering, with a small number of effective particles, was proposed to detect (micro)saccades (Daye & Optican, 2014). More recently, a hidden Markov model was proposed in Mihali, van Opheusden, and Ma (2017), and the Bayesian Inference Method (BIM) was derived accordingly as the microsaccade detector. A reliable and robust detection method inevitably relies on an accurate statistical model. However, due to differences between individuals, the problem of finding general statistical distributions for several characteristic quantities of a microsaccade, e.g., maximum velocity, duration, and amplitude, is not yet fully solved. Though based on principled statistical inference, the BIM has the following limitations. First, BIM assumes a constant-velocity model for microsaccades, which is not the case in real data; such a model simplifies the calculation at the cost of model accuracy. Second, the parameter estimation method in the BIM may bias the detection results. 
Last but not least, BIM is effective when a full set of gaze data has already been acquired, but real-time implementation is difficult, which makes BIM detectors hard to use in HCI applications. Efforts have also been made to develop data-driven approaches to microsaccade detection. In Bellet, Bellet, Nienborg, Hafed, and Berens (2018), a convolutional neural network (CNN) is trained using labeled gaze signals. To the best of our knowledge, this CNN method is the current state-of-the-art, providing, according to the authors, “human-level detection performance.” However, in contrast to a process or statistical model, a data-driven approach is agnostic to a possible underlying process and does not rely on or directly infer microsaccade characteristics. 
To address the aforementioned limitations of the BIM and the CNN approach, and to better detect microsaccades during eye fixation, we propose an alternative SDM, the Neyman Pearson Detector (NPD), for microsaccade detection. The rest of the article is organized as follows. We propose our statistical model for gaze data during eye fixation in the Statistical modeling of gaze data section. Based on the proposed model, in the Proposed detection method section, we derive the NPD for the detection of microsaccades. To test the validity of our model and to compare our NPD with BIM and the CNN method, outcomes are presented in the Results section. Last, we conclude the article in the Discussion and conclusion section. 
Statistical modeling of gaze data
In our proposed model, we only employ the well-developed statistics for fixational eye movement. Our proposed model classifies a sequence of gaze data into the following binary states: (a) the state of pure drifting movement, denoted as \({{\cal H}_0}\), and (b) the state of microsaccade, denoted as \({{\cal H}_1}\). The tremor component during eye fixation is not taken into consideration, as it is very low in amplitude and very few gaze trackers can measure it reliably (Ko, Snodderly, & Poletti, 2016). We present our statistical model for \({{\cal H}_0}\) first. Then, the model for \({{\cal H}_1}\) is discussed and several of its properties are derived. 
Model for \({\boldsymbol {\cal H}_{\bf 0}}\)
Let k denote the discrete time index for gaze position measurements, with \(k\in {{{\Bbb Z}}^{+}}\), where ℤ+ denotes the set of positive integers, i.e., {1, 2, 3, ...}; Xk denotes the actual (noise-free) gaze position as a visual angle at time k; and Zk denotes the measured eye-gaze position in visual angle at time k. It has been shown that the correlated random walk constitutes an accurate model for the drifting eye-gaze movement (Engbert & Kliegl, 2004; Matin, Matin, & Pearce, 1970). However, the correlation factors in Engbert and Kliegl (2004) are not trivial to estimate from the dataset itself. Thus, we propose a simplified yet more straightforward model based on Engbert and Kliegl (2004), where only first-order correlation is considered. More specifically, we propose the following model for state \({{\cal H}_0}\):  
\begin{equation}\tag{1}\left\{ {\matrix{ {{X_k}} \hfill&{ = {X_{k - 1}} + {W_k}} \hfill \cr {{Z_k}} \hfill&{ = {X_k} + {V_k}} \hfill \cr } } \right.\end{equation}
where \({W_k}\sim {\cal N}(0,\sigma _w^2)\) is the noise that models the drifting movement of the eye; \({V_k}\sim {\cal N}(0,\sigma _v^2)\) is the measurement noise; Wk and Vk are independent \(\forall k\in {\Bbb Z}^{+}\); σw, \({{\sigma }_{v}}\in {\Bbb R}^{+}\) represent the standard deviations of Wk and Vk, respectively; and \({\cal N}(\beta ,\gamma )\) stands for the Gaussian distribution with mean β and variance γ.  
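As a concrete illustration, the \({{\cal H}_0}\) model of Equation 1 can be simulated in a few lines; the noise levels below are illustrative choices, not values fitted to any of our datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_drift(n_samples, sigma_w=0.01, sigma_v=0.02, x0=0.0):
    """Simulate X_k = X_{k-1} + W_k and Z_k = X_k + V_k (Equation 1)."""
    w = rng.normal(0.0, sigma_w, n_samples)  # drift noise W_k
    v = rng.normal(0.0, sigma_v, n_samples)  # measurement noise V_k
    x = x0 + np.cumsum(w)                    # latent gaze position X_k
    z = x + v                                # measured gaze position Z_k
    return x, z

x, z = simulate_drift(500)
```

Note that the increments Zk − Zk−1 of such a trace have variance \(\sigma _w^2 + 2\sigma _v^2\), which is exactly the property exploited by the parameter estimation procedure described later.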
Model for \({{\cal H}_1}\)
In Han, Saunders, Woods, and Luo (2013), the generalized exponential function was proposed to model the shape of a saccade starting at time t = 0:  
\begin{equation}\tag{2}h(t;\bitheta ) = \left\{ {\matrix{ {} \hfill&{{\theta _1}\left\{ {1 - \exp \left[ { - {{\left( {{t \over {{\theta _2}}}} \right)}^{{\theta _3}}}} \right]} \right\},\qquad {\rm{\ }}t \ge 0} \hfill \cr {} \hfill&{0,\qquad {\rm{\ }}t \lt 0} \hfill \cr } } \right.\end{equation}
where \(\bitheta ={{[{{\theta }_{1}},{{\theta }_{2}},{{\theta }_{3}}]}^{T}}\in {{{\Bbb R}}^{3}}\) is a parameter vector, \(t\in {\Bbb R}\) is the continuous time index, and T denotes the transpose operation. We require \({\theta _1} \in {\Bbb R}\), θ2 > 0, and θ3 > 1. As saccades and microsaccades are thought to share the same generating neural mechanism (Martinez-Conde, Otero-Millan, & Macknik, 2013), we adopt the same generalized exponential function to model a microsaccade. Figure 1 shows three examples of h(t; θ) with different parameters. The parameter θ1 controls the amplitude of the function; θ2 and θ3 control its shape. The velocity profile of a microsaccade can also be captured through Equation 2,  
\begin{equation}\tag{3}v(t;\bitheta ) = {{\partial h(t;\bitheta )} \over {\partial t}} = {{{\theta _1}{\theta _3}} \over {{\theta _2}}}\exp \left[ { - {{\left( {{t \over {{\theta _2}}}} \right)}^{{\theta _3}}}} \right]{\left( {{t \over {{\theta _2}}}} \right)^{{\theta _3} - 1}} \end{equation}
which is a very good approximation of the actual velocity profile v(t; θ) of a microsaccade as studied in Van Opstal and Van Gisbergen (1987). From Equation 2 and Figure 1, we also notice the following facts: (a) h(t; θ) converges to θ1 as t → +∞; and (b) the time at which v(t; θ) reaches its maximum depends on θ. Therefore, we need to define the beginning and end of a microsaccade event explicitly based on Equation 2. We define the duration of a microsaccade that follows Equation 2, denoted as d, as the time span [t1, t2] over which its amplitude rises from 0.5θ1(1 – dm) to 0.5θ1(1 + dm), where dm ∈ (0,1) is a scaling factor. Therefore, during the defined duration of a microsaccade, dm × 100% of the total amplitude θ1 is completed. Figure 1 helps demonstrate the relation of θ1, dm, d, t1, and t2. It can be calculated that:  
\begin{equation}\tag{4}{t_1} = {\theta _2}{\left[ { - \ln \left( {0.5 + 0.5{d_m}} \right)} \right]^{{1 \over {{\theta _3}}}}}\end{equation}
 
\begin{equation}\tag{5}{t_2} = {\theta _2}{\left[ { - \ln \left( {0.5 - 0.5{d_m}} \right)} \right]^{{1 \over {{\theta _3}}}}}\end{equation}
 
\begin{equation}\tag{6}d = {\rm{\ }}{t_2} - {t_1}.\end{equation}
 
Figure 1
 
Generalized exponential function from Equation 2 and the interpretation of parameters in Equations 3, 4, 5, 6, 9, and 10. Comparing Functions 1 (in blue) and 2 (in red), we notice that θ2 controls onset of the saccadic event. Comparing Functions 2 (in red) and 3 (in yellow), we notice that θ3 controls the abruptness of the amplitude change.
A value close to 1 should be chosen for dm, and in this article, we choose dm = 0.95. Figure 1 can also help demonstrate the relation between parameters t1, t2, d, h(t; θ), and v(t; θ). 
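For a concrete sense of these definitions, the window of Equations 4 through 6 can be evaluated numerically with dm = 0.95; the θ values below are illustrative, not fitted to data.

```python
import numpy as np

def h(t, theta):
    """Generalized exponential saccade shape, Equation 2."""
    a, b, c = theta
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0, a * (1.0 - np.exp(-(np.maximum(t, 0.0) / b) ** c)), 0.0)

def saccade_window(theta, d_m=0.95):
    """Return (t_1, t_2, d) from Equations 4-6."""
    _, b, c = theta
    t_1 = b * (-np.log(0.5 + 0.5 * d_m)) ** (1.0 / c)
    t_2 = b * (-np.log(0.5 - 0.5 * d_m)) ** (1.0 / c)
    return t_1, t_2, t_2 - t_1

theta = (1.0, 0.01, 2.0)  # amplitude 1 deg, theta_2 = 10 ms, theta_3 = 2
t1, t2, d = saccade_window(theta)
# h rises from 0.025*theta_1 at t1 to 0.975*theta_1 at t2
```

For these parameters the duration comes out to d ≈ 17.6 ms, a plausible microsaccade duration.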
It is natural to modify h(t; θ) based on the defined start and end of a microsaccade, as:  
\begin{equation}\tag{7}{h_s}(t;\bitheta ) = h(t + {t_1};\bitheta ).\end{equation}
where hs(t; θ) is the shifted version of h(t; θ), such that the defined starting time of a microsaccade coincides with the time t = 0. For sampled data with sampling interval \(\Delta \ { \in {\Bbb R} ^ + }\), we define  
\begin{equation}\tag{8}H(k;\bitheta ) = {h_s}(k\Delta ;\bitheta ),\quad k{ \in {\Bbb Z} ^ + }\end{equation}
 
By letting \({{\partial v(t;\bitheta )} \over {\partial t}} = 0\), we can also calculate the time, tv, at which a microsaccade reaches its maximum velocity, vmax:  
\begin{equation}\tag{9}{t_v} = {\theta _2}{\left( {{{{\theta _3} - 1} \over {{\theta _3}}}} \right)^{{1 \over {{\theta _3}}}}}\end{equation}
 
\begin{equation}\tag{10}{v_{max}} = {{{\theta _1}{\theta _3}} \over {{\theta _2}}}\exp \left( {{{ - {\theta _3} + 1} \over {{\theta _3}}}} \right){\left( {{{{\theta _3} - 1} \over {{\theta _3}}}} \right)^{{{{\theta _3} - 1} \over {{\theta _3}}}}}\end{equation}
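The closed forms of Equations 9 and 10 can be cross-checked against a dense numerical evaluation of v(t; θ) from Equation 3; the θ values below are illustrative.

```python
import numpy as np

def v(t, theta):
    """Velocity profile of Equation 3."""
    a, b, c = theta
    s = np.asarray(t, dtype=float) / b
    return (a * c / b) * np.exp(-s ** c) * s ** (c - 1.0)

def peak(theta):
    """Closed-form peak-velocity time and value, Equations 9 and 10."""
    a, b, c = theta
    t_v = b * ((c - 1.0) / c) ** (1.0 / c)
    v_max = (a * c / b) * np.exp((1.0 - c) / c) * ((c - 1.0) / c) ** ((c - 1.0) / c)
    return t_v, v_max

theta = (1.0, 0.01, 2.0)
t_v, v_max = peak(theta)
t_grid = np.linspace(1e-6, 0.05, 20001)
v_grid = v(t_grid, theta)
```

The grid maximum agrees with the analytic (tv, vmax) to numerical precision.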
 
H(k; θ) can capture the amplitude and velocity profile of a microsaccade parametrically. Therefore, we can build the statistical model as:  
\begin{equation}\tag{11}\left\{ {\matrix{ {{X_k}} \hfill&{ = {X_{k - 1}} + {W_k} + {H_d}(k;\bitheta ,\omega )} \hfill \cr {{Z_k}} \hfill&{ = {X_k} + {V_k}} \hfill \cr } } \right.\end{equation}
where  
\begin{equation}\tag{12}{H_d}(k;\bitheta ,\omega ) = H(k - \omega ;\bitheta ) - H(k - \omega - 1;\bitheta ).\end{equation}
and \(\omega \in {\Bbb Z}\) is the start time of a microsaccade; \(0 \lt k - \omega \le d\), such that Equation 11 is only valid during the defined duration of a microsaccade; the drifting noise Wk and measurement noise Vk follow the same distributions as specified for Equation 1; and the starting value is Xω = 0. Comparing Equation 11 with Equation 1, we note that the only difference is the term Hd(k; θ, ω), whose essence is that it models a deterministic displacement representing the microsaccade. It is not hard to see that:  
\begin{equation}\tag{13}{\bf{E}}\left[ {{X_k}} \right] = H(k - \omega ;\bitheta )\end{equation}
where \({\bf{E}}\left[ \cdot \right]\) represents the theoretical expectation.  
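To illustrate Equations 11 and 12, the sketch below injects the deterministic displacement Hd into the drift random walk; the sampling interval, noise levels, and θ are illustrative assumptions, and the t1 shift of Equation 7 is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def h(t, theta):
    """Generalized exponential saccade shape, Equation 2."""
    a, b, c = theta
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0, a * (1.0 - np.exp(-(np.maximum(t, 0.0) / b) ** c)), 0.0)

def simulate_h1(n, theta, omega, dt=0.002, sigma_w=0.005, sigma_v=0.01):
    k = np.arange(1, n + 1)
    H = h((k - omega) * dt, theta)                   # sampled shape H(k; theta)
    hd = np.diff(np.concatenate(([0.0], H)))         # H_d(k) = H(k - omega) - H(k - omega - 1)
    x = np.cumsum(rng.normal(0.0, sigma_w, n) + hd)  # X_k, Equation 11
    z = x + rng.normal(0.0, sigma_v, n)              # Z_k
    return x, z

theta = (1.0, 0.01, 2.0)   # a 1-deg microsaccade
x, z = simulate_h1(400, theta, omega=200)
```

The latent trace x drifts around 0 before sample 200 and settles near θ1 = 1 afterward, consistent with the expectation of Xk during and after the event.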
Proposed detection method
In the previous section, we built the statistical model for gaze data; in this section, the statistical detection method for microsaccades is derived based on this model. We apply the widely used NPD method (Helstrom, 1971; Van Trees, 2004) from signal processing to the problem of saccadic event detection, specifically to test the hypothesis of whether the eye is in state \({{\cal H}_0}\) (drift) or state \({{\cal H}_1}\) (saccadic event). While the Bayesian method makes use of prior distributions over the parameters, the Neyman Pearson method uses the likelihood ratio test for the two eye states and gives equal weight to the two possibilities. Therefore, in certain situations, such as datasets with high individual differences in microsaccade generation, priors over parameters might not be feasible, and the Neyman Pearson approach may provide a better alternative for microsaccade detection. Based on our model, a sequence of gaze data can be sectioned into chunks with two possible states. A demonstration is shown in Figure 2, where the data points can be sectioned into three chunks: data in the left and right chunks follow state \({{\cal H}_0}\), and data in the middle chunk follow state \({{\cal H}_1}\). A microsaccade detector needs to provide the position of the boundary between different state chunks. Therefore, the NPD can be utilized naturally to solve this two-state hypothesis testing problem: we declare the detection of a microsaccade if state \({{\cal H}_1}\) is produced by the NPD. Yet, before implementing the NPD, several parameters need to be estimated. To demonstrate the proposed detection method, we first show the moment-based parameter estimation method for \(\sigma _v^2\) and \(\sigma _w^2\), as well as the maximum likelihood estimation (MLE) method for estimating θ. Then, we derive the test statistic under the NPD framework. 
Lastly, we unify the parameter estimation method with NPD to perform the task of detecting saccadic events, where some postprocessing steps are applied to further enhance the performance of the detector. 
Figure 2
 
Demonstration of simulated gaze position measurements and the two states \({{\cal H}_0}\) and \({{\cal H}_1}\).
Parameter estimation
Estimation of \(\sigma _v^2\) and \(\sigma _w^2\)
We employ the moment-based estimation method for \(\sigma _v^2\) and \(\sigma _w^2\), following the method developed in Zhu (2018), where the unknown variance is estimated using linear regression. The detailed application of the method in Zhu (2018) is presented next. 
Under state \({{\cal H}_0}\), we have:  
\begin{equation}\tag{14}{Z_k} - {Z_{k - n}} = \sum\limits_{i = 1}^n {{W_{k - n + i}}} + {V_k} - {V_{k - n}}\end{equation}
where \(n\in {{{\Bbb Z}}^{+}}\), and therefore, we have:  
\begin{equation}\tag{15}{Z_k} - {Z_{k - n}}\sim {\cal N}\left( {0,n\sigma _w^2 + 2\sigma _v^2} \right).\end{equation}
 
With Equation 15, we know that if we can estimate the variance of \({Z_k} - {Z_{k - n}}\), we can then solve for \(\sigma _v^2\) and \(\sigma _w^2\). Furthermore, \({\Bbb {VAR}}\left[ {{Z_k} - {Z_{k - n}}} \right]\) is a linear function with respect to n, where \({\Bbb {VAR}}\left[ \cdot \right]\) is the empirical variance. Hence, \(\sigma _v^2\) and \(\sigma _w^2\) can be estimated via linear regression, such that the regression slope is \(\sigma _w^2\) and one half of the regression intercept is \(\sigma _v^2\). 
Under state \({{\cal H}_1}\), we have:  
\begin{equation}\tag{16}{Z_k} - {Z_{k - n}} = \sum\limits_{i = 1}^n {{W_{k - n + i}}} + {V_k} - {V_{k - n}} + H(k - \omega ;\bitheta ) - H(k - n - \omega ;\bitheta ). \end{equation}
 
Therefore, we can find that:  
\begin{equation}\tag{17}{\bf{E}}\left[ {{Z_k} - {Z_{k - n}}} \right] = H(k - \omega ;\bitheta ) - H(k - n - \omega ;\bitheta )\end{equation}
and  
\begin{equation}\tag{18}{\bf{VAR}}\left[ {{Z_k} - {Z_{k - n}}} \right] = n\sigma _w^2 + 2\sigma _v^2.\end{equation}
 
The distribution for Zk – Zk–n under state \({{\cal H}_1}\) has a nonzero mean, which differs from that under state \({{\cal H}_0}\). Based on the distributional differences of Zk – Zk–n under the two states, we propose Algorithm 1 to estimate \(\sigma _v^2\) and \(\sigma _w^2\). 
Algorithm 1 Estimate \(\sigma _v^2\) and \(\sigma _w^2\) 
  1. Get a vector of measured gaze positions \({Z_{{k_0}:{k_1}}} = {\left[ {{Z_{{k_0}}},{Z_{{k_0} + 1}},...,{Z_{{k_1}}}} \right]^T}\).
  2. Section \({Z_{{k_0}:{k_1}}}\) into \(m{ \in {\Bbb Z} ^ + }\) chunks, such that \({\left[ {{{\boldsymbol Z}_1},...,{{\boldsymbol Z}_m}} \right]^T} = {Z_{{k_0}:{k_1}}}\).
  3. Pick a value \({n_0}{ \in {\Bbb Z} ^ + }\). Construct n0 empty sets \({{\boldsymbol V}_{1:{n_0}}}\) and an all-zero vector \({G_{1:{n_0}}}\).
  4. for i = 1, 2, ..., m do
  5.   Let Yk denote the kth element in Zi, where Zi has N elements in total.
  6.   for j = 1, 2, ..., n0 do
  7.     Calculate the vector \({\boldsymbol D} = {Y_{1 + j:N}} - {Y_{1:N - j}}\), and calculate \(\sigma _D^2 = {\Bbb {VAR}}\left[{\boldsymbol D} \right]\).
  8.     Add \(\sigma _D^2\) to the set Vj.
  9. for j = 1, 2, ..., n0 do
  10.   Let \({G_j} = {\boldsymbol {min}}\left[ {{{\boldsymbol V}_j}} \right]\).
  11. Perform the linear regression Gj = αj + β with respect to j.
  12. The estimate of \(\sigma _v^2\) is \(\hat \sigma _v^2 = 0.5\beta \).
  13. The estimate of \(\sigma _w^2\) is \(\hat \sigma _w^2 = \alpha \).
Let the empirical mean and variance of elements in vector D be denoted as:  
\begin{equation}\tag{19}{\Bbb E}\left[{\boldsymbol D} \right] = {1 \over M}\sum\limits_{i = 1}^M {{D_i}} \end{equation}
 
\begin{equation}\tag{20}{\Bbb {VAR}}\left[{\boldsymbol D} \right] = {1 \over {M - 1}}\sum\limits_{i = 1}^M {{{\left( {{D_i} - {\Bbb E}\left[{\boldsymbol D} \right]} \right)}^2}} \end{equation}
where \(M \gt 1\), \(M{ \in {\Bbb Z} ^ + }\), is the size of D, and Di is its ith element.  
Algorithm 1 calculates the variance of \({Z_k} - {Z_{k - n}}\) empirically, \(\forall {k_0} \le k \le {k_1}\), for different values of n. The key point in Algorithm 1 is Step 10, where we calculate the minimum Gj = min[Vj] of the elements of Vj. From Equations 14 through 18, we know that the empirical variance will be larger if (Zk – Zk–n) follows state \({{\cal H}_1}\) than if it follows state \({{\cal H}_0}\). Therefore, taking the minimum variance guarantees that the estimate comes from a subsection in state \({{\cal H}_0}\), not \({{\cal H}_1}\). 
Figure 3 shows the estimated \(n\sigma _w^2 + 2\sigma _v^2\) with respect to different n values using Algorithm 1 with the real gaze data provided in Mihali et al. (2017). The dotted line in Figure 3a depicts the estimated variance from the dataset, and the dashed line in Figure 3a shows the theoretical variance from fitted data points. We note from Figure 3a that as n increases, the estimated value does not always increase linearly at the same rate. That is because the measured gaze position is in units of visual angle, which cannot exceed 180°; therefore, the random walk model can only fit the data when n is small, in our case n ≤ 30. Also, the estimated value at n = 1 is an obvious outlier from the rest of the data points. Its existence is due to a limitation of the \({\boldsymbol {min}}[\cdot]\) operation: when n is very small, the value of \({\bf{E}}\left[ {{Z_k} - {Z_{k - n}}{\rm{|}}{{\cal H}_1}} \right]\) is very close to \({\bf{E}}\left[ {{Z_k} - {Z_{k - n}}{\rm{|}}{{\cal H}_0}} \right] = 0\), making it harder to differentiate between the two states by calculating min[Vj]. Therefore, in practice, we only use n values ranging from 2 to 10 to perform the estimation. Figure 3b shows the linear regression results for n = 2, 3, ..., 30. The resulting R2 score is 0.9942, suggesting that our linear model fits the data very well. 
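A compact Python sketch of Algorithm 1 (the chunking, the min[·] step, and the final regression against Equation 15) is given below, tested on synthetic drift-only data; the chunk count, n range, and noise levels are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def estimate_variances(z, m=4, n_values=range(2, 11)):
    """Algorithm 1: sigma_w^2 is the regression slope, sigma_v^2 half the intercept."""
    chunks = np.array_split(np.asarray(z, dtype=float), m)   # Step 2
    g = []
    for n in n_values:                                       # Steps 4-10
        # per-chunk empirical variance of the lag-n differences; keeping the
        # minimum guards against chunks contaminated by saccadic events
        g.append(min(np.var(c[n:] - c[:-n], ddof=1) for c in chunks))
    alpha, beta = np.polyfit(list(n_values), g, 1)           # Step 11: G_j = alpha*j + beta
    return alpha, 0.5 * beta                                 # Steps 12-13

# drift-only (H0) synthetic trace with known variances
sigma_w2, sigma_v2 = 1e-4, 4e-4
z = np.cumsum(rng.normal(0.0, np.sqrt(sigma_w2), 20000)) \
    + rng.normal(0.0, np.sqrt(sigma_v2), 20000)
w2_hat, v2_hat = estimate_variances(z)
```

On drift-only data the min[·] step introduces only a small downward bias, and both variances are recovered closely.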
Figure 3
 
Demonstration of \(\sigma _w^2\) and \(\sigma _v^2\) estimation.
MLE for θ1, θ2, and θ3
Now that we know how to estimate \(\sigma _v^2\) and \(\sigma _w^2\) from the data, we can exploit the MLE framework to estimate θ based on these estimates. Suppose that from k0 to k1, the measured gaze data \({Z_{{k_0}:k_1}}\) is in state \({{\cal H}_1}\). Based on Equation 11, we know that:  
\begin{equation}\tag{21}{C_k} = {Z_k} - {Z_{k - 1}} = {W_k} + {V_k} - {V_{k - 1}} + {H_d}(k;\bitheta ,{k_0})\sim {\cal N}\left( {{\mu _k},\sigma _c^2} \right),\quad {k_0} \lt k \le {k_1} \end{equation}
where \({\mu _k} = {H_d}(k;\bitheta ,{k_0})\) and \(\sigma _c^2 = \sigma _w^2 + 2\sigma _v^2\). Hence \({C_{{k_0} + 1:{k_1}}}\) follows the multivariate Gaussian distribution, such that:  
\begin{equation}\tag{22}{C_{{k_0} + 1:{k_1}}}\sim {\cal N}\left( {\bimu ,\bSigma } \right)\end{equation}
where  
\begin{equation}\tag{23}\bimu = {\left[ {{\mu _{{k_0} + 1}},...,{\mu _{{k_1}}}} \right]^T}\end{equation}
and  
\begin{equation}\tag{24}\bSigma = \left[ {\matrix{ {\sigma _c^2}&{ - \sigma _v^2}&0&\ldots&0 \cr { - \sigma _v^2}&{\sigma _c^2}&{ - \sigma _v^2}&\ldots&0 \cr 0&{ - \sigma _v^2}&{\sigma _c^2}&\ldots&0 \cr \vdots&\vdots&\vdots&\ddots&\vdots \cr 0&0&0&\ldots&{\sigma _c^2} \cr } } \right].\end{equation}
 
We have the covariance matrix Σ as in Equation 24 because, in our model, only the first-order correlation is considered; thus, all off-diagonal elements are zero except for those directly adjacent to the diagonal. Mathematically, the correlation between Ck and \({C_{k + 1}}\) is due to the common factor \({V_k}\), whereas Ck and \({C_{k + j}}\), j > 1, are uncorrelated as they share no common factor. Letting k0 + 1 = k2, we can write the probability density function of \({C_{{k_2}:{k_1}}}\) as:  
\begin{equation}\tag{25}f\left( {{C_{{k_2}:{k_1}}}|\bitheta ,{{\cal H}_1}} \right) = \det {\left( {2\pi \bSigma } \right)^{ - {1 \over 2}}}\exp \left[ { - {1 \over 2}{{\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)}^T}} {\cdot{\bSigma ^{ - 1}}\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)} \right]. \end{equation}
 
The log-likelihood function is:  
\begin{equation}\tag{26}\ln f\left( {{C_{{k_2}:{k_1}}}{\rm{|}}\bitheta ,{{\cal H}_1}} \right) = - {1 \over 2}\ln \det \left( {2\pi \bSigma } \right) - {1 \over 2}\left[ {{{\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)}^T}} {\cdot{\bSigma ^{ - 1}}\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)} \right].\end{equation}
 
From Equation 26, the maximum likelihood estimator for θ is:  
\begin{equation}\tag{27}{\widehat \bitheta _{ML}} = \arg \mathop {\max }\limits_\bitheta \ln f\left( {{C_{{k_2}:{k_1}}}{\rm{|}}\bitheta ,{{\cal H}_1}} \right) = \arg \mathop {\max }\limits_\bitheta \left( {2{\bimu ^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}} - {\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right) \end{equation}
 
To find the maximum likelihood estimator of θ1, \({\hat \theta _{1ML}}\), in Equation 27, we can solve the equation:  
\begin{equation}\tag{28}{{\partial \ln f\left( {{C_{{k_2}:{k_1}}}|\bitheta ,{{\cal H}_1}} \right)} \over {\partial {\theta _1}}} = 0\end{equation}
 
Hence, we can find that:  
\begin{equation}\tag{29}{\hat \theta _{1ML}} = {{{{\bar \bimu }^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}}} \over {{{\bar \bimu }^T}{\bSigma ^{ - 1}}\bar \bimu }}\end{equation}
 
where \(\bar \bimu = {1 \over {{\theta _1}}}\bimu \). A detailed derivation of Equations 27 and 29 is given in the Appendices. 
We use the simplex algorithm (Lagarias, Reeds, Wright, & Wright, 1998), an efficient method for finding minima without using function derivatives, to estimate \({\hat \theta _{2ML}}\) and \({\hat \theta _{3ML}}\). Once \({\hat \theta _{2ML}}\) and \({\hat \theta _{3ML}}\) are computed, we can use Equation 29 to find \({\hat \theta _{1ML}}\). 
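The estimation scheme above — a closed form for θ1 (Equation 29) nested inside a derivative-free simplex search over (θ2, θ3) — can be sketched as follows. This is an illustrative reimplementation, not the authors' code: `profile` is a hypothetical callable standing in for the unit-amplitude saccade template \(H_d\), and scipy's Nelder-Mead routine plays the role of the simplex algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def fit_theta(C, Sigma, profile, x0=(40.0, 40.0)):
    """Estimate (theta_1, theta_2, theta_3) by MLE.

    `profile(theta2, theta3)` is a hypothetical callable returning the
    unit-amplitude mean vector mu_bar = mu / theta_1 of length len(C).
    theta_1 has the closed form of Equation 29; (theta_2, theta_3) are
    found by the derivative-free Nelder-Mead simplex search.
    """
    Sinv = np.linalg.inv(Sigma)

    def t1_closed_form(mu_bar):
        return (mu_bar @ Sinv @ C) / (mu_bar @ Sinv @ mu_bar)

    def neg_loglik(x):
        mu_bar = profile(*x)
        den = mu_bar @ Sinv @ mu_bar
        if not np.isfinite(den) or den <= 0:
            return np.inf  # guard against degenerate templates
        mu = t1_closed_form(mu_bar) * mu_bar
        # maximize 2 mu^T Sinv C - mu^T Sinv mu (Equation 27)
        return -(2 * mu @ Sinv @ C - mu @ Sinv @ mu)

    res = minimize(neg_loglik, x0, method='Nelder-Mead')
    t2, t3 = res.x
    t1 = t1_closed_form(profile(t2, t3))
    return float(t1), float(t2), float(t3)
```

In a noise-free test where `C` is an exact scaled template, the search recovers the generating parameters to within the simplex tolerance.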
Test statistics for NPD
As mentioned before, in our setting the detection of microsaccades can be solved as a hypothesis testing problem under the NPD framework. Suppose that \({Z_{{k_1}:{k_2}}}\) is a section of measured gaze positions from k1 to k2. The detection of a microsaccade in \({Z_{{k_1}:{k_2}}}\) can then be cast as the problem of deciding between hypotheses \({{\cal H}_0}\) and \({{\cal H}_1}\). We can therefore rephrase our detection problem as:  
\begin{equation}{{\cal H}_0}\mbox{:}{\rm{\ }}{Z_{{k_1}:{k_2}}}\ {\rm{follows\ Equation\ }}1\end{equation}
 
\begin{equation}{{\cal H}_1}\mbox{:}\ {Z_{{k_1}:{k_2}}}\ {\rm{follows\ Equation\ }}11.\end{equation}
Or equivalently:  
\begin{equation}{{\cal H}_0}\mbox{:}\ {\boldsymbol C} = {Z_{{k_1} + 1:{k_2}}} - {Z_{{k_1}:{k_2} - 1}}\sim {\cal N}\left( {{\bf0},\bSigma } \right)\end{equation}
 
\begin{equation}{{\cal H}_1}\mbox{:}\ {\boldsymbol C} = {Z_{{k_1} + 1:{k_2}}} - {Z_{{k_1}:{k_2} - 1}}\sim {\cal N}\left( {\bimu ,\bSigma } \right).\end{equation}
where μ and Σ are the same as in Equation 22. The log-likelihood function for C under \({{\cal H}_1}\) was derived in Equation 26; similarly, the log-likelihood function for C under \({{\cal H}_0}\) is:  
\begin{equation}\tag{30}\ln f\left( {{\boldsymbol C}{\rm{|}}\bitheta ,{{\cal H}_0}} \right) = - {1 \over 2}\ln \det \left( {2\pi \;\bSigma } \right) - {1 \over 2}\left[ {{{\boldsymbol C}^T}{\bSigma ^{ - 1}}{\boldsymbol C}} \right]. \end{equation}
 
From Equations 26 and 30, and according to the NPD framework (Casella & Berger, 2002; Van Trees, 2004), we can calculate the log-likelihood ratio as:  
\begin{equation}\tag{31}\ln \Lambda = \ln \left[ {{{f\left( {{\boldsymbol C}{\rm{|}}\bitheta ,{{\cal H}_1}} \right)} \over {f\left( {{\boldsymbol C}{\rm{|}}\bitheta ,{{\cal H}_0}} \right)}}} \right] = {\bimu ^T}{\bSigma ^{ - 1}}{\boldsymbol C}. \end{equation}
 
From Equation 31, we know that the test statistic for NPD is:  
\begin{equation}\tag{32}T = {\bimu ^T}{\bSigma ^{ - 1}}{\boldsymbol C}\end{equation}
and we can then further rephrase our hypothesis testing problem as:  
\begin{equation}\eqalign{&{{\cal H}_0}\mbox{:}\ T\sim {\cal N}\left( {\bf0,{\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right) \cr&{{\cal H}_1}\mbox{:}\ T\sim {\cal N}\left( {{\bimu ^T}{\bSigma ^{ - 1}}\bimu ,{\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right). \cr} \end{equation}
 
Now, the problem of detecting a microsaccade has been transformed into the problem of determining the correct distribution between \({\cal N}\left( {{\bf0},{\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right)\) and \({\cal N}\left( {{\bimu ^T}{\bSigma ^{ - 1}}\bimu ,{\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right)\), given T. The NPD theory then tells us that the Neyman Pearson test can be performed as:  
\begin{equation}\tag{33}T \overset{{\cal H}_1}{\underset{{\cal H}_0}{\gtrless}} \rho \end{equation}
where ρ is the threshold based on the significance level α. We can calculate ρ as:  
\begin{equation}\tag{34}\rho = \sqrt {{\bimu ^T}{\bSigma ^{ - 1}}\bimu } {F^{ - 1}}(1 - \alpha )\end{equation}
where \({F^{ - 1}}(\cdot)\) is the inverse cumulative distribution function of the standard Gaussian distribution, such that the probability of false alarm, PFA, equals α. Common choices of α are 0.05 and 0.01. We can also calculate the probability of detection at significance level α as:  
\begin{equation}\tag{35}{P_D} = 1 - F\left( {{{\rho - {\bimu ^T}{\bSigma ^{ - 1}}\bimu } \over {\sqrt {{\bimu ^T}{\bSigma ^{ - 1}}\bimu } }}} \right)\end{equation}
where \(F(\cdot)\) is the cumulative distribution function of the standard Gaussian distribution. The optimality of NPD, according to the Neyman Pearson lemma, guarantees that NPD with the threshold in Equation 33 is the most powerful test at the given PFA.  
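Equations 32 through 35 combine into a compact test routine. The sketch below is a numpy/scipy illustration under our own naming (`np_test` is a hypothetical helper, not the paper's code):

```python
import numpy as np
from scipy.stats import norm

def np_test(mu, Sigma, C, alpha=0.01):
    """Neyman Pearson test of Equations 32 through 35.

    Returns the test statistic T, the threshold rho at false-alarm
    level alpha, the decision (True = accept H1), and the theoretical
    detection probability P_D at that threshold.
    """
    w = np.linalg.solve(Sigma, mu)             # Sigma^{-1} mu
    T = float(w @ C)                           # Equation 32
    s2 = float(w @ mu)                         # mu^T Sigma^{-1} mu = Var(T)
    rho = np.sqrt(s2) * norm.ppf(1.0 - alpha)  # Equation 34
    P_D = 1.0 - norm.cdf((rho - s2) / np.sqrt(s2))  # Equation 35
    return T, rho, T > rho, P_D
```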
NPD algorithm
The two previous sections derived the components necessary to detect a microsaccade. In this section, we combine the proposed parameter estimation methods with NPD. 
The proposed NPD algorithm is shown in Algorithm 2. A sliding window of length L is applied to the data vector \({Z_{{k_0}:{k_1}}}\), and the windowed data points are treated as being in either state \({{\cal H}_0}\) or \({{\cal H}_1}\) to perform the Neyman Pearson hypothesis tests. Steps 13 through 17 postprocess the detection results; their aim is to rectify the problem of overfitting, as explained in the following paragraphs. 
Algorithm 2 The NPD for microsaccades 
  1.  
    Get a sequence of tracked eye-gaze positions \({Z_{{k_0}:{k_1}}}\). 
  2.  
    Apply Algorithm 1 to estimate \(\sigma _v^2\) and \(\sigma _w^2\). 
  3.  
    Pick a value for \(L \in {{\Bbb Z}^ + }\), the length of the windows into which the vector \({Z_{{k_0}:{k_1}}}\) is sectioned. 
  4.  
    for i = k0, k0 + 1, ..., k1 − L + 1 do
  5.  
     Let \({\boldsymbol Z} = {Z_{i:i + L - 1}}\), a subsection of \({Z_{{k_0}:{k_1}}}\). 
  6.  
     Calculate \({\hat \bitheta _i}\) by applying MLE to the data points in Z. 
  7.  
     Calculate T using Equation 32 with \({\hat \bitheta _i}\). 
  8.  
     Calculate ρ using Equation 34. 
  9.  
    if T > ρ then
  10.  
      Ji = 1
  11.  
    else
  12.  
      Ji = 0
  13.  
    for \(i = {k_0},{k_0} + 1,...,{k_1} - L + 1\) do
  14.  
    if \(\left| {\hat \theta _{1ML}^{(i)}} \right| \lt {T_{{\theta _1}}}{\boldsymbol {median}}\left[ {\left| {{{\hat \bitheta }_{1ML}}} \right|} \right]\) then
  15.  
      Ji = 0
  16.  
    The NPD returns the vector \({J_{{k_0}:{k_1} - L + 1}}\) and the matrix \({\hat \bitheta _{{k_0}:{k_1} - L + 1}}\). 
  17.  
    Postprocess \({J_{{k_0}:{k_1} - L + 1}}\) and \({\hat \bitheta _{{k_0}:{k_1} - L + 1}}\) to yield the detected saccadic events. 
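The control flow of Algorithm 2 can be sketched as follows; this is an illustrative skeleton rather than the authors' implementation, with `fit_and_test` a hypothetical callable bundling the per-window MLE and Neyman Pearson steps (6 through 12), so only the sliding window and the adaptive θ1 threshold of steps 13 through 15 are shown explicitly:

```python
import numpy as np

def npd_sliding(Z, L, fit_and_test, T_theta1=0.5):
    """Sliding-window skeleton of Algorithm 2 (illustrative sketch).

    `fit_and_test(window)` is a hypothetical callable returning
    (decision, theta1_hat) for one length-L window. The post hoc
    theta_1 threshold suppresses windows whose fitted amplitude is
    small relative to the median, countering the overfitting of
    H0 data to the H1 model.
    """
    n = len(Z) - L + 1
    J = np.zeros(n, dtype=int)
    theta1 = np.zeros(n)
    for i in range(n):
        decision, theta1[i] = fit_and_test(Z[i:i + L])
        J[i] = int(decision)
    # adaptive amplitude threshold (steps 13 through 15)
    cut = T_theta1 * np.median(np.abs(theta1))
    J[np.abs(theta1) < cut] = 0
    return J, theta1
```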
The MLE algorithm is applied regardless of whether the windowed data points are in state \({{\cal H}_0}\) or \({{\cal H}_1}\). Therefore, overfitting can occur when data points in state \({{\cal H}_0}\) are nevertheless fitted to the model of state \({{\cal H}_1}\) by MLE. Such overfitting can also increase the probability of false alarm; indeed, when θ1 is very close to 0, the difference between \({{\cal H}_0}\) and \({{\cal H}_1}\) is very small. In the extreme, \({{\cal H}_0} = {{\cal H}_1}\) when θ1 = 0, reducing the proposed detection method to a random guess. To avoid overfitting and limit the increased probability of false alarm, a threshold on the parameter θ1 is needed. Figures A1b and A1d show the estimated θ1 for a real gaze signal. Since θ1 is estimated under state model \({{\cal H}_1}\), every windowed segment is in effect fitted as if it were a saccadic eye movement. We can see from Figures A1b and A1d that the absolute amplitude \(\left| {{\theta _1}} \right|\) is large when a true saccadic event has occurred and small when there is none. Also, because state \({{\cal H}_0}\) is observed far more often than state \({{\cal H}_1}\) (saccadic events occur only a few times per second), we can apply an adaptive threshold to \(\left| {{\theta _1}} \right|\) to solve the overfitting problem. 
By calculating the median of all the estimated \(\left| {{\theta _1}} \right|\), \({\boldsymbol {median}}\left[ {\left| {{{\hat \bitheta }_{1ML}}} \right|} \right]\), where \({\hat \bitheta _{1ML}}\) is the vector of all estimated θ1 values for the input sequence \({Z_{{k_0}:{k_1}}}\), we can estimate the typical value of \(\left| {{\theta _1}} \right|\) when no saccadic event has occurred. Multiplying this median by a factor \({T_{{\theta _1}}}\), we reject the hypothesis \({{\cal H}_1}\), regardless of T and ρ, whenever \(\left| {\hat \theta _{1ML}^{(i)}} \right| \lt {T_{{\theta _1}}}\;{\boldsymbol {median}}\left[ {\left| {{{\hat \bitheta }_{1ML}}} \right|} \right]\). This reasoning explains steps 13 through 15 in Algorithm 2, where \(\hat \theta _{1ML}^{(i)}\) denotes the ith element of \({\hat \bitheta _{1ML}}\). 
The Step 17 postprocessing is necessary because the hypothesis test is statistically accurate only if the tested data follow either \({{\cal H}_0}\) or \({{\cal H}_1}\). However, given the windowing in Algorithm 2 Step 5, we cannot guarantee that the vector \({Z_{i:i + L - 1}}\) follows state \({{\cal H}_0}\) or \({{\cal H}_1}\) exactly. We usually set L to around the maximum possible duration of a microsaccade; it cannot be too large, otherwise two or more microsaccades could be entirely enclosed in the vector \({Z_{i:i + L - 1}}\). The five conditions we commonly encounter are listed below: 
  •  
    (a) \({Z_{i:i + L - 1}}\) follows \({{\cal H}_0}\). 
  •  
    (b) An entire microsaccade is enclosed in \({Z_{i:i + L - 1}}\), and the microsaccade starts at time j = i. 
  •  
    (c) An entire microsaccade is enclosed in \({Z_{i:i + L - 1}}\), and the microsaccade starts at time j > i. 
  •  
    (d) A section of a microsaccade is enclosed in \({Z_{i:i + L - 1}}\), and the microsaccade starts at time j < i. 
  •  
    (e) A section of a microsaccade is enclosed in \({Z_{i:i + L - 1}}\), and the microsaccade ends at time j > i + L − 1. 
Conditions a and b are ideal scenarios, in which the two-state model captures the behavior of the data points flawlessly. For Conditions c, d, and e, however, the data points only partially fit our model, and the resulting behavior of the detector needs to be analyzed individually under simulation. Using both simulated and actual data, we find that under Conditions c and e, NPD tends to reject \({{\cal H}_1}\) and accept \({{\cal H}_0}\). Under Condition d, NPD almost always accepts \({{\cal H}_1}\), since a sectioned microsaccade also fits the model in \({{\cal H}_1}\), but the resulting estimated amplitude is reduced. Based on this behavior, we postprocess \({J_{{k_0}:{k_1} - L + 1}}\) and \({\hat \bitheta _{{k_0}:{k_1} - L + 1}}\) as follows: (1) Apply a median filter of strength M to \({J_{{k_0}:{k_1} - L + 1}}\), so that only consecutively detected events are kept. (2) Within each run of consecutive 1s, take as the starting time of the microsaccade the position that yields the minimal false alarm probability, which can be calculated as:  
\begin{equation}\tag{36}{P_{FA}} = 1 - F\left( {\sqrt T } \right).\end{equation}
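These two postprocessing steps can be sketched as below (our own illustrative code, with `postprocess` a hypothetical name): a median filter of strength M keeps only runs of consecutive detections, and within each surviving run the window minimizing the P_FA of Equation 36 is taken as the onset.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.stats import norm

def postprocess(J, T_stats, M=5):
    """Illustrative Step-17-style postprocessing.

    Median-filter the binary detection sequence J (strength M) so only
    consecutive detections survive, then, within each surviving run,
    return the index with the smallest P_FA = 1 - F(sqrt(T)).
    """
    Jf = median_filter(J.astype(float), size=M) > 0.5
    onsets, i, n = [], 0, len(Jf)
    while i < n:
        if Jf[i]:
            j = i
            while j < n and Jf[j]:
                j += 1
            run = np.arange(i, j)
            pfa = 1.0 - norm.cdf(np.sqrt(np.maximum(T_stats[run], 0.0)))
            onsets.append(int(run[np.argmin(pfa)]))
            i = j
        else:
            i += 1
    return onsets
```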
 
To compute the MLE of θ efficiently, the search over candidate parameter pairs (θ2, θ3) can be accelerated by imposing additional constraints on (θ2, θ3). One such constraint is that the parameter pair yields a plausible maximum velocity and a duration within a specific range. These constraints on (θ2, θ3) significantly accelerate the computation. 
Results
In this section, results from simulated as well as real data are presented. We start by testing the parameter estimation accuracy of the MLE method for θ, as well as for vmax, tv, and d, using simulated data with known ground truth. The receiver operating characteristic (ROC) curve of the NPD on simulated data is also presented. Then, we compare the proposed detector with the state-of-the-art CNN method in terms of the area under the curve (AUC) of the ROC plot and in terms of the sensitivity index. Finally, we compare the NPD and BIM methods to evaluate parameter estimation accuracy in terms of the linearity of the main sequence. 
Test with simulated data
The aim of this section is to test our MLE and NPD methods with known parameters and ground truth. Our models have five parameters: \(\sigma _v^2\), \(\sigma _w^2\), θ1, θ2, and θ3. A thorough test of all combinations of these parameters is not practical. Therefore, we use the signal-to-noise ratio (SNR) to capture their joint behavior; in most scenarios, the SNR directly affects the performance of statistical estimation and detection. 
Data generation is the first step in testing our model. We set k0 = 0, k1 = 100, θ1 = 1, θ2 = 40, θ3 = 40, and \(\sigma _w^2 = 0.0049\). The SNR is controlled by \(\sigma _v^2\), for which we pick three values: 0, 0.01, and 0.09. Both σw and σv were defined earlier, in Equation 1; as a reminder, it makes sense to fix σw because it relates to an intrinsic oculomotor process, and this value captures it well. Note that σw is fixed only in this simulation section; it is estimated from real data for the detection tasks. The value of \(\sigma _v^2\), in contrast, changes with the eye-tracking technology in use. The resulting SNRs with \(\sigma _v^2 = \) 0, 0.01, and 0.09 are 23, 18, and 10 dB, respectively. With these parameter settings, for each \(\sigma _v^2\), we generate 2,000 independent trials, each randomly assigned to state \({{\cal H}_0}\) or \({{\cal H}_1}\) with equal probability. 
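As an illustration of such data generation (Equations 1 and 11 are defined earlier in the paper), the sketch below assumes a random-walk drift driven by \(W_k \sim {\cal N}(0,\sigma_w^2)\) observed with i.i.d. measurement noise \(V_k \sim {\cal N}(0,\sigma_v^2)\), which is consistent with the differenced model of Equation 21; the optional `saccade` argument is a per-sample displacement profile standing in for \(H_d\) under \({{\cal H}_1}\):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_trial(n=100, sigma_w2=0.0049, sigma_v2=0.01, saccade=None):
    """Generate one simulated gaze trace: a random-walk drift X driven by
    W_k ~ N(0, sigma_w2), observed with measurement noise
    V_k ~ N(0, sigma_v2). `saccade`, if given, is a length-n per-sample
    displacement profile (the role of H_d) accumulated into the drift.
    """
    W = rng.normal(0.0, np.sqrt(sigma_w2), n)
    X = np.cumsum(W)
    if saccade is not None:
        X = X + np.cumsum(saccade)  # saccadic displacement under H1
    V = rng.normal(0.0, np.sqrt(sigma_v2), n)
    return X + V
```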
We first test our Algorithm 2 with all parameters provided to the algorithm; the resulting detection is therefore theoretically optimal under the NPD framework. Figure 4 shows the resulting empirical and theoretical distributions of the test statistic T under states \({{\cal H}_0}\) and \({{\cal H}_1}\). In Figure 4a, where SNR = 23 dB, the two distributions are highly separable by picking a threshold at the value of 25. However, as the SNR decreases, as in Figure 4c, the two distributions overlap, so some trade-off is needed before applying a threshold to differentiate them. The ROC curves for the detection results are shown in Figure 5. A detector is better if its ROC curve is further from the diagonal dotted chance line in Figure 5a, which represents random guessing. From Equations 34 and 35, we can calculate the theoretical ROC curves; both empirical curves (solid lines) and theoretical curves (dashed lines) are plotted in Figure 5. The NPD is very robust to noise, and all empirical ROC curves are very close to the theoretical ones. Figure 5b is a zoomed view of Figure 5a showing the performance of the NPD for α ∈ [0, 0.025]. 
Figure 4
 
The distribution of statistics T with different SNR.
Figure 5
 
ROC curves for Algorithm 2 with known parameters.
With the same simulated dataset, we then test our MLE method for estimating θ as well as d, tv, and vmax. Only trials in state \({{\cal H}_1}\) are used for the estimation of θ. Based on the estimated θ, we employ Equations 9, 10, and 6 to estimate tv, vmax, and d, respectively. We measure estimator performance via the mean squared error (MSE). Table 1 shows the parameter estimation results, where the MSEs for θ1, θ2, and θ3 are unit free, the MSEs for d and tv have units of ms², and the MSE for vmax has units of (°/ms)². From Table 1, the estimation of θ3 shows the largest inaccuracy; nevertheless, the estimated d, vmax, and tv remain accurate. 
Table 1
 
Mean squared error of the maximum likelihood estimated parameters.
Test with real data
In this section, we demonstrate the outputs of our detector on real data and compare the NPD with the CNN method and the BIM. Against the CNN method, the state of the art, we compare detector performance in terms of the AUC of the ROC plot. Against the BIM, the best-known statistics-based approach, we compare parameter estimation accuracy (maximum velocity and amplitude) through the main sequence. 
Demonstration of NPD outputs
The output of the NPD, given a trial of inputs (the X and Y components of the eye-gaze position), consists of the following components: (a) the estimated θ for each windowed subsection of the input data points; (b) the detection result (a binary sequence, where 0 indicates the absence and 1 the presence of a saccadic event); and (c) the best-fitted θ1, θ2, and θ3 for each detected event and the corresponding parameters, i.e., t1, the starting point of the event; t2, the end point of the event; d, the duration of the event; vmax, the maximum velocity of the saccadic eye movement; and tv, the time index at which vmax is reached. 
Since most existing detection algorithms produce only binary detection results, our method is more immediately informative about the detected saccadic eye movements. With all the estimated parameters for each detected saccadic event, we can plot the noise-free saccadic eye movement on top of the noisy eye-gaze position measurements. Figure A1 demonstrates some of the NPD outputs. Figures A1a and A1c show the X and Y components of the raw gaze signal (purple lines), in which two saccadic events can be observed. The bold lines (blue and red) reproduce the noise-free saccadic eye movement based on the estimated θ. The circles and crosses in Figures A1a and A1c indicate the starting and ending points of each event, per the definition discussed in the Statistical modeling of gaze data section. The stars indicate the positions of maximum velocity for the corresponding saccadic events, and the calculated value of vmax is also indicated in Figures A1a and A1c. Figures A1b and A1d show the estimated θ1, which does not depend on the drifting movement of the eye, as it hovers around zero when no saccadic event has occurred. 
Comparison with the CNN method
The CNN method is the state-of-the-art approach to detecting saccadic events, having “human-level performance” (Bellet et al., 2018). The comparison uses the public dataset in Bellet et al. (2018), and Table 2 summarizes the data used to test the performance of the NPD against the CNN. For Datasets A, C, and D, the EyeLink 1000 gaze tracker (SR Research Ltd, Ottawa, Canada) was used, and for Dataset B, a search coil was used to track the gaze position. It is worth noting that Dataset B differs greatly from the rest, not only in the tracking method but also in the underlying eye movement: catch-up saccades (saccades during smooth pursuit) are the main saccadic events in Dataset B. Catch-up saccades exhibit different characteristics from saccades at fixation and are therefore outside the application scope of our proposed detection method; nevertheless, detection results on Dataset B are presented for reference. All saccadic events in the dataset were manually labeled by experts, so we can plot ROC curves to report the detection results. Among the four datasets, we exclude all trials with missing measurement values, which may be due to temporary loss of eye features during tracking, and we discard trials in which a saccadic event is truncated at the beginning or end of the trial. For the CNN method, we use the online server provided by Bellet et al. (2018) to perform the detection. Among the several pregenerated weights provided by Bellet et al. (2018) for the CNN, we select the pretrained “combined weights,” which were trained on the largest amount of data; more specifically, the “combined weights” were trained on data collected in the same experiments that produced Datasets A, B, and C. The data trials for training and testing are mutually exclusive. 
For our NPD, we randomly choose values for L, M, α, and \({T_{{\theta _1}}}\) for each entire dataset. Some combinations of these parameters are superior to others, and the ROC curve is plotted by taking the upper envelope of the (PD, PFA) value pairs. From the NPD performances obtained with randomly selected parameters, we acquired insights into the proper choices of these parameters; the proper choices share the same properties across all datasets, and we discuss them in the Discussion and conclusion section. 
Table 2
 
Summary of data information (for testing) in Bellet et al. (2018).
Figure A2 shows the ROC curves (full curves on the left; curves zoomed to PFA ∈ [0, 0.2] and PD ∈ [0.48, 1] on the right) of the detection results for each dataset, and Table 3 summarizes the AUC for each detector. Other than on Dataset B, the NPD yields a larger AUC than the CNN method. We therefore conclude that our NPD outperforms the CNN method when the dataset does not contain catch-up saccades. Besides the AUC, we also use the sensitivity index, denoted d′, to compare detector performance. d′ reflects the ability to discriminate the target signal from contaminating noise (Macmillan & Creelman, 2004); a larger d′ represents stronger discriminating power. Table 3 shows the sensitivity index of each detector for all datasets, and the results agree with the AUC measurements in Table 3. 
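The text does not spell out how d′ is computed; one standard equal-variance Gaussian convention (Macmillan & Creelman, 2004) derives it from the AUC, as sketched below. This is an assumption on our part, since the paper may use a different estimator:

```python
import numpy as np
from scipy.stats import norm

def d_prime_from_auc(auc):
    """Equal-variance Gaussian sensitivity index from the AUC:
    d' = sqrt(2) * Phi^{-1}(AUC), where Phi^{-1} is the standard
    normal quantile function. AUC = 0.5 (chance) maps to d' = 0.
    """
    return np.sqrt(2.0) * norm.ppf(auc)
```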
Table 3
 
Area under the curve (AUC) and sensitivity index for each dataset of the convolutional neural networks (CNN) with pretrained “combined weights” and NPD methods.
The performance differences between the two detectors can be analyzed as follows. On Dataset B, because our statistical test is constructed without considering smooth pursuit movement, we expect inferior performance; in that sense, the CNN is more versatile in detecting different kinds of saccadic events. For the CNN, the AUC and the sensitivity index are relatively stable across Datasets A, B, and C, but on Dataset D its performance drops substantially, as reflected by the much smaller AUC and sensitivity index values. The CNN performs worse on Dataset D because the experiments in D were not included in the training dataset, consistent with the poorer generalization to unseen datasets expected of such data-driven approaches. The NPD, on the other hand, is relatively stable across Datasets A, C, and D, as indicated by similar AUC and sensitivity index values. This stability partially results from the structure of the algorithm: a parameter estimation step followed by a Neyman-Pearson decision-making step. The parameter estimation step is explicitly included in our NPD algorithm to handle data differences; that is, we can parameterize data differences in terms of σw, σv, and θ before producing the detection outcome. 
Comparison with the BIM
We now compare with the BIM using the dataset in Mihali et al. (2017). It contains five sequences of tracked gaze positions from five participants, detailed in Table 4. In the Data info column of Table 4, the number indicates the participant, and the letter “X” or “Y” indicates the horizontal or vertical component of the gaze position, respectively. Following the convention of Martinez-Conde et al. (2009), in which the main sequence of detected microsaccade events is plotted to validate the detected events, we also plot the main sequence as a test criterion for the parameter estimation accuracy of the detected microsaccades. The main sequence is the log-linear relationship between the amplitude and the maximum velocity of microsaccades (Zuber et al., 1965). Figure A3 shows the main sequence of the microsaccades detected by both methods, where R denotes the R² score for linear regression. Detailed detection results are summarized in Table 5, where N stands for the number of events detected by each method, and Na is the adjusted number of detected events for the NPD. Na is calculated as the total number of microsaccades of both eyes minus the number of binocularly occurring microsaccades of one individual, so that it reflects the same quantity as N (BIM). Due to the high noise level of Datasets 4X through 5Y, as reflected by their large \({\hat \sigma _v}\) and \({\hat \sigma _w}\) compared to the rest, the NPD cannot detect any event there; therefore, the main sequence from the NPD for Participants 4 and 5 cannot be plotted for comparison with the BIM. 
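The main-sequence linearity check can be reproduced with an ordinary least-squares fit in log-log coordinates; the sketch below (our own helper, not the authors' code) returns the slope, intercept, and R² score of the kind reported in Figure A3:

```python
import numpy as np

def main_sequence_r2(amplitude, vmax):
    """Fit the main sequence: a linear regression of log(vmax) on
    log(amplitude), returning the slope, intercept, and R^2 score.
    """
    x, y = np.log(amplitude), np.log(vmax)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r2 = 1.0 - resid.var() / y.var()
    return slope, intercept, r2
```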
Table 4
 
Data information in Mihali et al. (2017) and NPD parameter estimates.
Table 5
 
Summary of the detection results.
From the NPD results in Table 5 and the estimated σv and σw for each dataset, we notice that the NPD detects more events when the noise is small. That is due to the false alarm probability constraint imposed within the NPD once α is set. When the noise is large (σv and σw are large), even true saccadic events may not pass the test, reducing the total number of detected events for the NPD. The advantage of this mechanism is that it guarantees that all detected events are reliable at significance level α; the drawback is that the probability of detection decreases as the noise level increases. To counteract the decrease in PD, we can increase α, at the cost of increasing PFA. The BIM, in contrast, does not encounter this problem, as it detects roughly the same number of events regardless of the noise level; however, we then cannot know to what extent a detected event from the BIM is a false alarm. 
Discussion and conclusion
In this article, we performed a detailed study of microsaccade detection, building a novel statistical model of fixational eye movements and deriving the NPD from it. Compared with the state-of-the-art CNN method, our NPD yields a larger AUC, implying that for certain datasets it detects saccadic eye movements better. Compared with the BIM, our model is statistically more accurate, and the detected events are less likely to be false alarms. The results also suggest that our NPD can detect both microsaccades and normal saccades. 
On the one hand, the simulations and tests on real datasets reveal room for improvement in our NPD. First, the computational complexity is high; our implementation uses the OpenMP library for parallel computing, but more advanced optimization or approximation methods are needed to further reduce the computational burden. Second, as a parametric method, the NPD relies on the proper choice of its parameters (i.e., M, the strength of the median filter; L, the section length for detection; α; and \({T_{{\theta _1}}}\)). For datasets acquired at different sampling frequencies, or with different measurement noise, some of these parameters usually need to be adjusted, and inappropriately selected parameters may prevent the detector from performing optimally. Based on empirical experience, Table 6 summarizes proper choices of these parameters for datasets with a 1000 Hz sampling rate; for other sampling rates, the choices of M and L can be scaled proportionally to the sampling rate difference. Finally, the NPD is less reliable when the data contain missing values or catch-up saccades; extending it to such scenarios requires a more complex statistical model. The length of the dataset is also worth mentioning: since σv and σw are estimated from the data itself, a longer trial (more than 1,000 data points) is preferred for parameter estimation accuracy. 
Table 6
 
Suggested parameter ranges based on empirical experience (for gaze signal sampled at 1000 Hz).
On the other hand, compared to previous detection methods in the literature, our approach has the following advantages. First, it is a statistical method, so all estimation and detection results are statistically valid; via the Neyman Pearson lemma, we have a confidence bound on our detection results given α. Second is online applicability: unlike the BIM or CNN methods, which cannot be implemented online, our detection method is strictly causal, and a detection can be made as soon as a microsaccade has finished. The only modification needed is a calibration phase at the beginning to estimate σv, σw, and \({\boldsymbol {median}}\left[ {\left| {{{\hat \bitheta }_{1ML}}} \right|} \right]\); once calibration is finished, Algorithm 2 can run online, for HCI or for more advanced cognitive studies and research. Last but not least, the NPD provides the user with a more complete profile of each detected event, including its duration, amplitude, and maximum velocity. 
Acknowledgments
This research was funded by the Natural Sciences and Engineering Research Council of Canada. 
Commercial relationships: none. 
Corresponding author: Hongzhi Zhu. 
Email: hzhu@ece.ubc.ca
Address: Department of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada. 
References
Adler, F. H., & Fliegelman, M. (1934). Influence of fixation on the visual acuity. Archives of Ophthalmology, 12 (4), 475–483.
Bellet, M. E., Bellet, J., Nienborg, H., Hafed, Z. M., & Berens, P. (2018). Human-level saccade detection performance using deep neural networks. Journal of Neurophysiology, 121 (2), 646–661.
Bengi, H., & Thomas, J. (1968). Three electronic methods for recording ocular tremor. Medical and Biological Engineering, 6 (2), 171–179.
Casella, G., & Berger, R. L. (2002). Statistical inference (vol. 2). Pacific Grove, CA: Duxbury.
Chennamma, H., & Yuan, X. (2013). A survey on eye-gaze tracking techniques. arXiv:1312.6410, https://arxiv.org/ftp/arxiv/papers/1312/1312.6410.pdf.
Ciuffreda, K. J., & Tannen, B. (1995). Eye movement basics for the clinician. Maryland Heights, MO: Mosby.
Daye, P. M., & Optican, L. M. (2014). Saccade detection using a particle filter. Journal of Neuroscience Methods, 235, 157–168.
Deubel, H., & Bridgeman, B. (1995). Fourth Purkinje image signals reveal eye-lens deviations and retinal image distortions during saccades. Vision Research, 35 (4), 529–538.
Dowling, J. E., & Dowling, J. L. (2016). Vision: How it works and what can go wrong. Cambridge, MA: MIT Press.
Engbert, R. (2006). Microsaccades: A microcosm for research on oculomotor control, attention, and visual perception. Progress in Brain Research, 154, 177–192.
Engbert, R., & Kliegl, R. (2004). Microsaccades keep the eyes' balance during fixation. Psychological Science, 15 (6), 431–436.
Han, P., Saunders, D. R., Woods, R. L., & Luo, G. (2013). Trajectory prediction of saccadic eye movements using a compressed exponential model. Journal of Vision, 13 (8): 27, 1–13, https://doi.org/10.1167/13.8.27. [PubMed] [Article]
Helstrom, C. W. (1971). Quantum detection theory. Reproduced by the National Technical Information Service, Springfield, VA.
Ko, H.-K., Snodderly, D. M., & Poletti, M. (2016). Eye movements between saccades: Measuring ocular drift and tremor. Vision Research, 122, 93–104.
Lagarias, J. C., Reeds, J. A., Wright, M. H., & Wright, P. E. (1998). Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM Journal on Optimization, 9 (1), 112–147.
Macmillan, N. A., & Creelman, C. D. (2004). Detection theory: A user's guide. Psychology Press.
Martinez-Conde, S., Macknik, S. L., Troncoso, X. G., & Hubel, D. H. (2009). Microsaccades: A neurophysiological analysis. Trends in Neurosciences, 32 (9), 463–475.
Martinez-Conde, S., Otero-Millan, J., & Macknik, S. L. (2013). The impact of microsaccades on vision: Towards a unified theory of saccadic function. Nature Reviews Neuroscience, 14 (2), 83.
Matin, L., Matin, E., & Pearce, D. G. (1970). Eye movements in the dark during the attempt to maintain a prior fixation position. Vision Research, 10 (9), 837–857.
Mihali, A., van Opheusden, B., & Ma, W. J. (2017). Bayesian microsaccade detection. Journal of Vision, 17 (1): 13, 1–23, https://doi.org/10.1167/17.1.13. [PubMed] [Article]
Otero-Millan, J., Castro, J. L. A., Macknik, S. L., & Martinez-Conde, S. (2014). Unsupervised clustering method to detect microsaccades. Journal of Vision, 14 (2): 18, 1–17, https://doi.org/10.1167/14.2.18. [PubMed] [Article]
Otero-Millan, J., Macknik, S. L., Langston, R. E., & Martinez-Conde, S. (2013). An oculomotor continuum from exploration to fixation. Proceedings of the National Academy of Sciences, 110 (15), 6175–6180.
Rolfs, M. (2009). Microsaccades: Small steps on a long way. Vision Research, 49 (20), 2415–2441.
Ruhland, K., Peters, C. E., Andrist, S., Badler, J. B., Badler, N. I., Gleicher, M., … McDonnell, R. (2015). A review of eye gaze in virtual agents, social robotics and HCI: Behaviour generation, user interaction and perception. In Computer graphics forum (vol. 34, pp. 299–326).
Sheynikhovich, D., Bécu, M., Wu, C., & Arleo, A. (2018). Unsupervised detection of microsaccades in a high-noise regime. Journal of Vision, 18 (6): 19, 1–16, https://doi.org/10.1167/18.6.19. [PubMed] [Article]
Sinn, P., & Engbert, R. (2016). Small saccades versus microsaccades: Experimental distinction and model-based unification. Vision Research, 118, 132–143.
Van Opstal, A., & Van Gisbergen, J. (1987). Skewness of saccadic velocity profiles: A unifying parameter for normal and slow saccades. Vision Research, 27 (5), 731–745.
Van Trees, H. L. (2004). Detection, estimation, and modulation theory, Part I: detection, estimation, and linear modulation theory. John Wiley & Sons.
Wandell, B. A. (1995). Foundations of vision (vol. 8). Sunderland, MA: Sinauer Associates.
Zhu, H. (2018). Statistical modeling and tracking of artery wall motion in ultrasound images (Unpublished doctoral dissertation). McGill University, Montreal, Canada.
Zhu, H., Salcudean, S. E., & Rohling, R. N. (2019). A novel gaze-supported multimodal human–computer interaction for ultrasound machines. International Journal of Computer Assisted Radiology and Surgery, 14 (7), 1107–1115.
Zuber, B., Stark, L., & Cook, G. (1965, December 10). Microsaccades and the velocity-amplitude relationship for saccadic eye movements. Science, 150 (3702), 1459–1460.
Appendix
Derivation of Equations 27 and 29
From Equation 26, we have:  
\begin{equation}\tag{37}\ln f\left( {{C_{{k_2}:{k_1}}}{\rm{|}}\bitheta ,{{\cal H}_1}} \right) = - {1 \over 2}\ln \det \left( {2\pi \bSigma } \right) - {1 \over 2}\left[ {{{\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)}^T}} {\cdot{\bSigma ^{ - 1}}\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)} \right]\end{equation}
and we have  
\begin{equation}\tag{38}{\left( {{C_{{k_2}:{k_1}}} - \bimu } \right)^T}\cdot{\bSigma ^{ - 1}}\left( {{C_{{k_2}:{k_1}}} - \bimu } \right) = {C_{{k_2}:{k_1}}}^T{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}} - 2{C_{{k_2}:{k_1}}}^T{\bSigma ^{ - 1}}\bimu + {\bimu ^T}{\bSigma ^{ - 1}}\bimu \end{equation}
 
By substituting Equation 38 into Equation 37, and dropping constant terms, we can simplify Equation 27 as:  
\begin{equation}\tag{39}{\hat \bitheta _{ML}} = \arg \mathop {\max }\limits_\bitheta \left( {2{\bimu ^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}} - {\bimu ^T}{\bSigma ^{ - 1}}\bimu } \right).\end{equation}
 
Since the parameter θ1 acts as a scaling factor of μ, we can define  
\begin{equation}\tag{40}\bar \bimu = {1 \over {{\theta _1}}}\bimu \end{equation}
such that each element of \(\bar \bimu \) is independent of θ1:  
\begin{equation}\tag{41}{{\bar \mu }_i} = \exp \left[ { - {{\left( {{{{t_1} + i - 1} \over {{\theta _2}}}} \right)}^{{\theta _3}}}} \right] - \exp \left[ { - {{\left( {{{{t_1} + i} \over {{\theta _2}}}} \right)}^{{\theta _3}}}} \right] \end{equation}
where \({\bar \mu _i}\) is the ith element of \(\bar \bimu \). Therefore, we can rewrite Equation 28 as:  
\begin{equation}\tag{42}{{\partial \ln f\left( {{C_{{k_2}:{k_1}}}|\bitheta ,{{\cal H}_1}} \right)} \over {\partial {\theta _1}}} = {\partial \over {\partial {\theta _1}}}\left[ {2{\theta _1}{{\bar \bimu }^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}} - {\theta _1}^2{{\bar \bimu }^T}{\bSigma ^{ - 1}}\bar \bimu } \right] = 2{{\bar \bimu }^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}} - 2{\theta _1}{{\bar \bimu }^T}{\bSigma ^{ - 1}}\bar \bimu = 0 \end{equation}
and solving it yields Equation 29:  
\begin{equation}\tag{43}{\hat \theta _{1ML}} = {{{{\bar \bimu }^T}{\bSigma ^{ - 1}}{C_{{k_2}:{k_1}}}} \over {{{\bar \bimu }^T}{\bSigma ^{ - 1}}\bar \bimu }}.\end{equation}
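As a sanity check on this derivation (not part of the paper), the closed-form estimate of Equation 43 can be verified numerically: for a random positive-definite Σ and a random shape vector μ̄, the estimate should maximize the quadratic objective of Equation 39 over θ1. All values below are synthetic.

```python
# Numerical check of Equation 43: theta1_hat = (mu_bar^T Sigma^{-1} C) /
# (mu_bar^T Sigma^{-1} mu_bar) should maximize the objective of Equation 39,
# 2 mu^T Sigma^{-1} C - mu^T Sigma^{-1} mu, with mu = theta1 * mu_bar.
import numpy as np

rng = np.random.default_rng(0)
n = 6
mu_bar = rng.standard_normal(n)            # fixed shape vector (Equation 41)
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)            # symmetric positive-definite covariance
Sigma_inv = np.linalg.inv(Sigma)
C = 2.5 * mu_bar + rng.standard_normal(n)  # noisy observation, true theta1 = 2.5

def objective(theta1):
    mu = theta1 * mu_bar
    return 2 * mu @ Sigma_inv @ C - mu @ Sigma_inv @ mu

theta1_hat = (mu_bar @ Sigma_inv @ C) / (mu_bar @ Sigma_inv @ mu_bar)

# The objective is a concave quadratic in theta1, so the closed-form
# estimate must beat any perturbation of itself.
assert all(objective(theta1_hat) >= objective(theta1_hat + d)
           for d in (-0.1, -0.01, 0.01, 0.1))
```

Because the objective is quadratic in θ1 with negative leading coefficient \(-{\bar \bimu}^T \bSigma^{-1} \bar \bimu\), its unique maximizer is exactly the vertex given by Equation 43, which is what the perturbation check confirms.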
 
Figure A1
 
Demonstration of the NPD output.
Figure A2
 
ROC curves of NPD and CNN for each dataset. (a) Full ROC dataset A. (b) Zoomed ROC plot dataset A. (c) Full ROC dataset B. (d) Zoomed ROC plot dataset B. (e) Full ROC dataset C. (f) Zoomed ROC plot dataset C. (g) Full ROC dataset D. (h) Zoomed ROC plot dataset D.
Figure A3
 
Main sequence with NPD and BIM.
Figure 1
 
Generalized exponential function from Equation 2 and the interpretation of the parameters in Equations 3, 4, 5, 6, 9, and 10. Comparing Functions 1 (in blue) and 2 (in red), we notice that θ2 controls the onset of the saccadic event. Comparing Functions 2 (in red) and 3 (in yellow), we notice that θ3 controls the abruptness of the amplitude change.
Figure 2
 
Demonstration of simulated gaze position measurements and the two states \({{\cal H}_0}\) and \({{\cal H}_1}\).
Figure 3
 
Demonstration of \(\sigma _w^2\) and \(\sigma _v^2\) estimation.
Figure 4
 
The distribution of the statistic T under different SNRs.
Figure 5
 
ROC curves for Algorithm 2 with known parameters.
Table 1
 
Mean squared error of the maximum likelihood estimated parameters.
Table 2
 
Summary of data information (for testing) in Bellet et al. (2018).
Table 3
 
Area under the curve (AUC) and sensitivity index for each dataset of the convolutional neural networks (CNN) with pretrained “combined weights” and NPD methods.
Table 4
 
Data information in Mihali et al. (2017) and NPD parameter estimates.
Table 5
 
Summary of the detection results.