Abstract
Two approaches to video gaze tracking are compared. The “bottom-up” approach makes estimates from measurements on the two-dimensional image, such as the image positions of the pupil centroid, corneal reflex, limbus, etc. This may be contrasted with a “top-down” approach, in which the pose parameters of an eye model are adjusted, in conjunction with a calibrated camera model, to obtain a match to the image data. One advantage of the model-based approach is its robustness to changes in geometry, such as might occur due to slippage of a head-mount supporting a camera platform. A second advantage is that the pose estimates are obtained directly in units of degrees; calibration serves only to determine the relation between the visual and optical axes and to provide a check on the model. While traditional grid calibration methods need not be applied, a set of views of the eye in a variety of poses is required to determine the model parameters for an individual (this may be the entire data set when recorded images are analyzed offline). Errors introduced by computation-saving approximations are considered.
Supported by NASA's Airspace Systems and Aviation Safety programs, and the
FAA's Vertical Flight Human Factors program.