**Abstract**:

**Despite the growing popularity of virtual reality environments, few laboratories are equipped to investigate eye movements within these environments. This primer is intended to reduce the time and effort required to incorporate eye-tracking equipment into a virtual reality environment. We discuss issues related to the initial startup and provide algorithms necessary for basic analysis. Algorithms are provided for the calculation of gaze angle within a virtual world using a monocular eye-tracker in a three-dimensional environment. In addition, we provide algorithms for the calculation of the angular distance between the gaze and a relevant virtual object and for the identification of fixations, saccades, and pursuit eye movements. Finally, we provide tools that temporally synchronize gaze data and the visual stimulus and enable real-time assembly of a video-based record of the experiment using the QuickTime MOV format, available at http://sourceforge.net/p/utdvrlibraries/. This record contains the visual stimulus, the gaze cursor, and associated numerical data and can be used for data export, visual inspection, and validation of calculated gaze movements.**

^{1} with a field of view that stretches 102° along the horizontal and 64° along the vertical. A 14-camera Phasespace X2 motion capture system^{2} running at 480 Hz tracked head and body movement within a capture volume with submillimeter accuracy.

^{3} tracks the eye within the HMD and is accompanied by Viewpoint software version 2.9.2.6. The eye-tracker refresh rate is restricted by the NTSC standard and functions at a frequency of approximately 60 frames/second when the image is de-interlaced. Although de-interlacing is accompanied by some loss of resolution, modern eye-tracking suites typically adopt algorithms that are robust to this degradation.

*field of view*, which is denoted simply as “fov”). Most vectors and matrices and some scalar values have a subscript indicating the frame of reference. For example, the position of the eye within the eye/screen reference frame is $\mathbf{e}_E = [0\ 0\ 0]^T$. We use a circumflex to distinguish a unit length direction vector from a point in space. The gaze direction within the virtual world is $\hat{\mathbf{g}}_W$. At times we add additional subscripts to specify a particular scalar element from a vector (e.g., the height of the eye within the world would be $e_{Wz}$). We also use the subscript notation to indicate the value of a variable from a particular point in time $t$ when necessary.

*x*-axis is to the right of the eye, while the *y*-axis is up, relative to the eye. The eye looks along the negative *z*-axis. The programmer parametrically specifies an appropriate field-of-view for the eye that is used by OpenGL to calculate the mapping from a point in the virtual world to pixel coordinates when displayed inside the HMD. Thus, the rendering process may be thought of as a conversion from features in the virtual world to locations within the capture volume, and then to pixel locations on the HMD display. Calculating the gaze-in-world vector reverses this rendering process. It is important, therefore, to be sure that the relationships between the various frames of reference are known and available when processing gaze data.

*F* is a placeholder used to indicate which frame of reference the matrix or point uses.

$(\mathrm{pix}_x, \mathrm{pix}_y)$ pixels. Here, we discuss how to convert pixel coordinates into an angular measure of gaze location in which an eye centered in front of the screen will have horizontal and vertical visual angles of $(\theta_s = 0, \phi_s = 0)$ when looking through the central pixel $(\mathrm{pix}_x/2, \mathrm{pix}_y/2)$. This same center pixel expressed in terms of the angular height and width of the field-of-view and measured in visual degrees along the horizontal and vertical screen axes is $(\mathrm{fov}_x, \mathrm{fov}_y)$. Rearranging the basic trigonometric relationship $\tan(\mathrm{fov}_x/2) = \mathrm{pix}_x/(2d)$ yields a distance from the eye to the center of the screen, $d = \mathrm{pix}_x/[2\tan(\mathrm{fov}_x/2)]$. Given some pixel coordinate $(x, y)$, the screen-relative visual angles are equal to the inverse tangent of the ratio between the pixel distance from this point and the center of the screen along its respective axis, and the distance of the eye from the screen. Rearranged for clarity, these angles are

$$\theta_s = \tan^{-1}\!\left[\frac{x - \mathrm{pix}_x/2}{d}\right] \quad \text{and} \quad \phi_s = \tan^{-1}\!\left[\frac{y - \mathrm{pix}_y/2}{d}\right].$$
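The pixel-to-angle conversion described here can be sketched in Python as follows. This is a minimal illustration, not the authors' released code: the function names are hypothetical, the 1280 × 1024 resolution and 76° horizontal field of view are placeholder values, and the sketch assumes square pixels so that the same distance $d$ applies along both screen axes.

```python
import math


def eye_to_screen_distance(pix_x: float, fov_x_deg: float) -> float:
    """Eye-to-screen distance d in pixel units: d = pix_x / (2 tan(fov_x / 2))."""
    return pix_x / (2.0 * math.tan(math.radians(fov_x_deg) / 2.0))


def pixel_to_screen_angles(x, y, pix_x, pix_y, fov_x_deg):
    """Return (theta_s, phi_s) in degrees for the pixel coordinate (x, y).

    Assumption: square pixels, so one distance d serves both axes.
    phi_s keeps the pixel convention here (positive toward larger y).
    """
    d = eye_to_screen_distance(pix_x, fov_x_deg)
    theta_s = math.degrees(math.atan((x - pix_x / 2.0) / d))
    phi_s = math.degrees(math.atan((y - pix_y / 2.0) / d))
    return theta_s, phi_s


# Placeholder display parameters, chosen only for illustration.
PIX_X, PIX_Y, FOV_X_DEG = 1280, 1024, 76.0

centre = pixel_to_screen_angles(PIX_X / 2, PIX_Y / 2, PIX_X, PIX_Y, FOV_X_DEG)
right_edge = pixel_to_screen_angles(PIX_X, PIX_Y / 2, PIX_X, PIX_Y, FOV_X_DEG)
print("centre pixel:", centre)
print("right edge:  ", right_edge)
```

A useful sanity check is that the right edge of the screen maps to half of the horizontal field of view, which follows directly from the definition of $d$.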

(*x*, *y*) to a point, $\mathbf{g}_E$, on a fictitious plane one unit in front of the eye in eye-relative coordinates (where directions are defined relative to the screen). Referring to Figure 4, we see that the *x*-coordinate of this point is $g_{Ex} = \tan(\theta_s)$. The formula for the *y*-coordinate is equivalent and the *z*-coordinate is, trivially, one unit ahead. Substituting in the formulas for $\theta_s$, $\phi_s$, and $d$ from Equation 2, we end up with the following formula for an eye-relative point on a plane one unit in front of the eye:

$$\mathbf{g}_E = \begin{bmatrix} (x - \mathrm{pix}_x/2)/d \\ -(y - \mathrm{pix}_y/2)/d \\ -1 \end{bmatrix}$$

*y* increases downward, but in eye space, *y* increases upward. The third element is negative because, as discussed earlier, the gaze direction lies along the negative *z*-axis in eye space.
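As a sketch of this step, the function below maps a pixel coordinate to an eye-relative point on the plane one unit in front of the eye, applying the two sign conventions just described. The function names and display parameters are hypothetical, and square pixels are assumed.

```python
import math


def pixel_to_eye_point(x, y, pix_x, pix_y, fov_x_deg):
    """Map pixel (x, y) to a point on the plane one unit in front of the eye.

    Eye space: +x right, +y up, gaze along -z.
    Assumption: square pixels, so d (in pixels) is shared by both axes.
    """
    d = pix_x / (2.0 * math.tan(math.radians(fov_x_deg) / 2.0))
    g_x = (x - pix_x / 2.0) / d      # tan(theta_s)
    g_y = -(y - pix_y / 2.0) / d     # sign flipped: pixel y grows downward
    g_z = -1.0                       # the eye looks along the negative z-axis
    return (g_x, g_y, g_z)


def normalize(v):
    """Scale a 3-vector to unit length so it can serve as a direction."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


# Placeholder display parameters, for illustration only.
point = pixel_to_eye_point(900, 300, 1280, 1024, 76.0)
direction = normalize(point)
print("point on unit plane:", point)
print("unit gaze direction:", direction)
```

Normalizing the point gives a unit direction vector, which is the form needed by the later rotation steps.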

*ρ* degrees (e.g., in the NVis SX111, *ρ* = 13°), in which case one must take an additional step to ensure accurate placement of the gaze vector within the virtual world: transforming from eye space into HMD direction space by multiplying with a rotation matrix that rotates the vector around the eye's *y*-axis (see Figure 5). Note that the frame of reference is otherwise unchanged. For the left eye, this matrix is as follows:
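A rotation about the eye's *y*-axis can be sketched as below. The rotation formula itself is the standard right-handed one; which sign of *ρ* applies to which eye depends on the HMD's geometry, so the choice of +*ρ* for the left eye and −*ρ* for the right eye in the example is an assumption that should be checked against the hardware.

```python
import math


def rotate_about_y(v, rho_deg):
    """Apply a right-handed rotation of rho degrees about the +y axis.

    Matrix form:
        [ cos(rho)  0  sin(rho)]
        [    0      1     0    ]
        [-sin(rho)  0  cos(rho)]
    With +x right and gaze along -z, a positive rho turns the
    straight-ahead vector (0, 0, -1) toward -x, i.e., to the left.
    """
    r = math.radians(rho_deg)
    c, s = math.cos(r), math.sin(r)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)


RHO_DEG = 13.0  # screen rotation quoted in the text for the NVis SX111

straight_ahead = (0.0, 0.0, -1.0)
left_eye_dir = rotate_about_y(straight_ahead, +RHO_DEG)   # assumed sign
right_eye_dir = rotate_about_y(straight_ahead, -RHO_DEG)  # assumed sign
print("left eye: ", left_eye_dir)
print("right eye:", right_eye_dir)
```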

$q_C = [q_w\ q_x\ q_y\ q_z]$. When used to represent a spatial orientation, a quaternion encodes a rotation by $\vartheta$ around an arbitrary unit axis, $[x\ y\ z]$, as $q = [\cos(\vartheta/2)\ \ x\sin(\vartheta/2)\ \ y\sin(\vartheta/2)\ \ z\sin(\vartheta/2)]$. Using standard formulas from the graphics community (Horn, 1987), we create a rotation matrix from this quaternion to find gaze direction within the capture volume coordinates established at calibration time:
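The widely used unit-quaternion-to-rotation-matrix conversion can be sketched as follows, using the $[q_w\ q_x\ q_y\ q_z]$ ordering given in the text. The function names are hypothetical. Whether a particular motion capture system reports the rotation from HMD space into the capture volume or its inverse is setup specific, so the direction of the transform is an assumption to verify.

```python
import math


def quat_from_axis_angle(axis, angle_rad):
    """Build [w, x, y, z] for a rotation of angle_rad about the given axis."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    s = math.sin(angle_rad / 2.0) / n
    return (math.cos(angle_rad / 2.0), x * s, y * s, z * s)


def quat_to_matrix(q):
    """Convert a quaternion [w, x, y, z] to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n   # guard against drift from unit length
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


# Example: a 90 degree head turn about the vertical (z) axis.
q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2.0)
print(mat_vec(quat_to_matrix(q), (1.0, 0.0, 0.0)))
```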

$\phi_W$) and yaw ($\theta_W$). We define *pitch* as the angle between a vector and the ground plane, with positive angles corresponding to the vector pointing upward. The angle between the forward (positive *y*) axis and the projection of a vector onto the ground plane is *yaw*, with positive angles to the right. Yaw is undefined if the vector is perpendicular to the ground plane. One can extract the individual components of pitch and yaw from the world-based gaze vector:

$$\phi_W = \sin^{-1}(\hat{g}_{Wz}) \quad \text{and} \quad \theta_W = \tan^{-1}\!\left(\hat{g}_{Wx} / \hat{g}_{Wy}\right).$$
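A minimal sketch of the pitch and yaw extraction follows. It assumes a world frame in which *z* is up, *y* is forward, and *x* points to the right, consistent with the definitions above; the function name is hypothetical. The sketch uses `atan2` so that yaw is resolved over the full circle, and returns `None` for yaw when the vector is perpendicular to the ground plane.

```python
import math


def pitch_yaw_deg(gaze_w):
    """Return (pitch, yaw) in degrees for a world-frame gaze vector.

    Assumed world frame: +z up, +y forward, +x to the right.
    Yaw is None when the vector is perpendicular to the ground plane.
    """
    gx, gy, gz = gaze_w
    n = math.sqrt(gx * gx + gy * gy + gz * gz)
    gx, gy, gz = gx / n, gy / n, gz / n
    # Clamp to guard against rounding slightly outside [-1, 1].
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, gz))))
    if math.hypot(gx, gy) < 1e-12:
        return pitch, None
    yaw = math.degrees(math.atan2(gx, gy))
    return pitch, yaw


print(pitch_yaw_deg((0.0, 1.0, 0.0)))   # straight ahead
print(pitch_yaw_deg((1.0, 1.0, 0.0)))   # ahead and to the right
print(pitch_yaw_deg((0.0, 0.0, 1.0)))   # straight up, yaw undefined
```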

*x*-axis:

**=**

*γ* = cos^{−1}( · ).
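As an illustration of this inverse-cosine form, the sketch below computes the angle between the gaze direction and the direction from the eye to a target. The function name and the example positions are hypothetical; the inputs are assumed to be expressed in the same world frame.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def angular_distance_deg(gaze_dir, eye_pos, target_pos):
    """Angle in degrees between the gaze direction and the eye-to-target direction."""
    to_target = tuple(t - e for t, e in zip(target_pos, eye_pos))
    g = normalize(gaze_dir)
    t = normalize(to_target)
    cos_gamma = sum(a * b for a, b in zip(g, t))
    # Clamp so that rounding error cannot push the argument outside acos's domain.
    cos_gamma = max(-1.0, min(1.0, cos_gamma))
    return math.degrees(math.acos(cos_gamma))


# Hypothetical scene: eye 1.6 m above the floor, looking straight ahead (+y).
eye = (0.0, 0.0, 1.6)
gaze = (0.0, 1.0, 0.0)
print(angular_distance_deg(gaze, eye, (0.0, 5.0, 1.6)))
print(angular_distance_deg(gaze, eye, (5.0, 5.0, 1.6)))
```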

*ε* is measured along an axis that is parallel to the world's up axis and perpendicular to the gaze vector. The horizontal component of error *α* is measured along an axis that is parallel to the ground plane but perpendicular to both the vertical axis and the gaze vector (see Figure 6). One calculates these vectors using a cross-product:
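One way to build such axes with cross-products is sketched below. This is an assumed construction rather than the authors' exact formula: the horizontal axis is taken as gaze × up, the vertical axis as the cross-product of the horizontal axis with the gaze vector, and each error component is read off as the angle of the target direction along that axis. The world frame is assumed to have *z* up.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / n for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def error_components_deg(gaze_dir, target_dir, up=(0.0, 0.0, 1.0)):
    """Return (epsilon, alpha): vertical and horizontal gaze error in degrees.

    Raises ValueError if the gaze is parallel to the up axis, where the
    horizontal axis is undefined.
    """
    g = normalize(gaze_dir)
    t = normalize(target_dir)
    h = normalize(cross(g, up))   # parallel to the ground, perpendicular to gaze
    v = cross(h, g)               # perpendicular to gaze, in the vertical plane of gaze
    ahead = dot(t, g)
    alpha = math.degrees(math.atan2(dot(t, h), ahead))
    epsilon = math.degrees(math.atan2(dot(t, v), ahead))
    return epsilon, alpha


print(error_components_deg((0.0, 1.0, 0.0), (1.0, 1.0, 0.0)))  # target to the right
print(error_components_deg((0.0, 1.0, 0.0), (0.0, 1.0, 1.0)))  # target above
```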

*edge* of a virtual object is complicated by transformations due to changes in viewpoint and perspective, which change an object's projection onto the two-dimensional view plane. Because there is no clear method for extracting the pixels that define an object's edge once it has been projected onto the display screen, edge location must be approximated. For example, one might approximate all objects as spheres. Because spheres are viewpoint independent, changes in viewpoint will only bring about changes in visual size that may easily be approximated using basic trigonometry. Subsequently, one can subtract the approximated angular size of the object from the measurements of angular distance provided above.
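The sphere approximation can be sketched as follows. The angular radius of a sphere seen from outside is the inverse sine of its radius over the distance to its center; subtracting that from the angular distance to the center gives an approximate angular distance to the edge. The function names and example values are hypothetical.

```python
import math


def sphere_angular_radius_deg(radius, distance_to_centre):
    """Angular radius of a sphere viewed from outside, in degrees."""
    if distance_to_centre <= radius:
        raise ValueError("viewpoint must lie outside the sphere")
    return math.degrees(math.asin(radius / distance_to_centre))


def angular_distance_to_edge_deg(gamma_deg, radius, distance_to_centre):
    """Angular distance from gaze to the sphere's edge; 0 when gaze lands on the sphere."""
    edge = gamma_deg - sphere_angular_radius_deg(radius, distance_to_centre)
    return max(0.0, edge)


# Hypothetical object: bounding sphere of radius 1 m whose center is 2 m from the eye.
print(sphere_angular_radius_deg(1.0, 2.0))
print(angular_distance_to_edge_deg(35.0, 1.0, 2.0))
print(angular_distance_to_edge_deg(10.0, 1.0, 2.0))
```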

$g_{vel}$) as the gaze vector changes across subsequent frames:

*φ*), as well as the change in perspective (*ς*) due to this slight translation. Consider a circular head of radius *r* and a target that is distance *d* from the circumference of the head (Figure 7). If the eye is directly aligned between the center of the head and the target during fixation (an initial visual angle of *θ*), a head rotation of *φ* results in an angular change of *β* = *φ* + *ς*, where *ς* reflects the component of VOR necessary to compensate for the eye's translation due to head rotation. Given head rotation, head radius, and the distance between the head and target, this extra component is computed: *ς* = tan^{−1}[(*r* sin *φ*)/(*r* + *d* − *r* cos *φ*)]. In this equation, the bottom portion of the fraction reflects the contribution of eye translation (*r* cos *φ*) to the magnitude of the vector between the eye and the target (*r* + *d*). The top portion of the equation reflects the perpendicular component of eye translation.
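The formula for *ς* can be evaluated directly, as in the sketch below. The head radius of 0.1 m and the two target distances are illustrative values, not measurements from the study, and the function names are hypothetical.

```python
import math


def vor_translation_component_deg(phi_deg, r, d):
    """Extra eye rotation (degrees) needed because the eye translates as the head turns.

    phi_deg: head rotation in degrees
    r:       head radius
    d:       distance from the circumference of the head to the target (same units as r)
    """
    phi = math.radians(phi_deg)
    return math.degrees(math.atan2(r * math.sin(phi), r + d - r * math.cos(phi)))


def total_compensation_deg(phi_deg, r, d):
    """Total angular change beta = phi + sigma required to maintain fixation."""
    return phi_deg + vor_translation_component_deg(phi_deg, r, d)


HEAD_RADIUS_M = 0.1  # illustrative value only

for target_distance_m in (0.1, 100.0):
    sigma = vor_translation_component_deg(10.0, HEAD_RADIUS_M, target_distance_m)
    beta = total_compensation_deg(10.0, HEAD_RADIUS_M, target_distance_m)
    print(f"d = {target_distance_m} m: sigma = {sigma:.3f} deg, beta = {beta:.3f} deg")
```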

*d* is very small, values of *ς* are larger and contribute to an appreciable increase in the overall angular change required to maintain fixation. However, if *d* is large relative to *r*, *ς* approaches zero and contributes little to the total angular change required for VOR.

*pursuit gain*, to approach unity. Although it becomes increasingly difficult to sustain unity gain at higher velocities, pursuit gain as high as 0.9 has been observed for targets moving up to 100°/s when the head is fixed (Meyer, Lasker, & Robinson, 1985), and at speeds of 184°/s when the head is unrestrained (Hayhoe, McKinney, Chajka, & Pelz, 2012).

$\omega_{ft}$) we measure the angular distance between the vectors recorded on successive frames. The cross-product between the vector recorded at two different points in time results in a new vector, $c_{ft}$, orthogonal to both measured vectors, scaled by the sine of the angle between them. Normalizing this vector and then scaling by the angle gives the angular velocity in radians per frame around some axis in the world frame.

$\boldsymbol{\omega}_{ft} = 0$. However, if the angular velocity of the target around the eye is zero, then pursuit gain might not be an appropriate measure since pursuit implies a moving target. In this case we can still compute the change in absolute angular error between the gaze and the target over time, although this is not the same as pursuit gain: $\gamma_t - \gamma_{(t-1)}$.
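The cross-product construction of angular velocity, and a simple pursuit-gain ratio built on it, can be sketched as follows. The same function applies to either the gaze vector or the eye-to-target vector on two successive frames. Defining gain as the ratio of gaze speed to target speed is an assumed, simplified definition, and the function names are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angular_velocity(v_prev, v_curr):
    """Rotation vector (radians per frame) that carries v_prev onto v_curr."""
    a, b = normalize(v_prev), normalize(v_curr)
    c = cross(a, b)                                   # length = sin(angle)
    s = math.sqrt(sum(x * x for x in c))
    if s < 1e-12:
        return (0.0, 0.0, 0.0)                        # no rotation, or axis undefined
    # atan2 of (sin, cos) is better conditioned than acos for small angles.
    angle = math.atan2(s, sum(x * y for x, y in zip(a, b)))
    return tuple(angle * x / s for x in c)


def angular_speed(v_prev, v_curr):
    """Magnitude of the angular velocity, in radians per frame."""
    return math.sqrt(sum(x * x for x in angular_velocity(v_prev, v_curr)))


def pursuit_gain(gaze_prev, gaze_curr, target_prev, target_curr):
    """Gaze speed divided by target speed; None when the target is stationary."""
    target_speed = angular_speed(target_prev, target_curr)
    if target_speed == 0.0:
        return None
    return angular_speed(gaze_prev, gaze_curr) / target_speed


def direction_at(yaw_deg):
    """Helper for the example: a horizontal unit vector at the given yaw."""
    r = math.radians(yaw_deg)
    return (math.sin(r), math.cos(r), 0.0)


print(pursuit_gain(direction_at(0.0), direction_at(0.9),
                   direction_at(0.0), direction_at(1.0)))
```

Multiplying the per-frame speed by the sampling rate (approximately 60 Hz for the eye tracker described here) converts it to radians per second.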

*Journal of Physiology*, 589 (Pt 7), 1627–1642. doi:10.1113/jphysiol.2010.199471.

*Brain and Cognition*, 68 (3), 309–326. doi:10.1016/j.bandc.2008.08.020.

*Journal of Vestibular Research Equilibrium Orientation*, 6 (6), 455–461.

*Journal of Vision*, 13 (1): 20, 1–14, http://www.journalofvision.org/content/13/1/20, doi:10.1167/13.1.20.

*Proceedings of the Symposium on Eye Tracking Research & Applications - ETRA '02*, 103. doi:10.1145/507093.507094.

*Behavior Research Methods, Instruments, & Computers: A Journal of the Psychonomic Society, Inc*, 34 (4), 573–591.

*Experimental Brain Research*, 217 (1), 125–136. doi:10.1007/s00221-011-2979-2.

*Journal of the Optical Society of America*, 4, 629–642.

*Vision Research*, 51 (10), 1173–1184. doi:10.1016/j.visres.2011.03.006.

*Vision Research*, 25 (4), 561–563.

*Behavior Research Methods*, 42 (1), 188–204. doi:10.3758/BRM.42.1.188.

*Journal of Physiology*, 584 (Pt 1), 11–23. doi:10.1113/jphysiol.2007.139881.

*Proceedings of the 2000 Symposium on Eye Tracking Research & Applications* (pp. 71–78). New York: ACM.