Abstract
Models of eye movements during scene viewing attempt to explain distributions of fixations but do not typically address the neural basis of saccade programming. Models of saccade programming are tied more closely to neural mechanisms, but they have not been applied to scenes because of limitations on the inputs they can accept, typically just dot stimuli. This work bridges the gap between these literatures by adding an image-based “front end” to a model of saccade programming in the superior colliculus (SC). An image of a scene is first blurred to reflect the acuity limitations at the current fixation location, and a saliency map is computed from this fixation-blurred image. This saliency map (in pixel coordinates) is projected onto the collicular surface (in mm coordinates), where the activity of each SC neuron is modeled as a Gaussian-weighted sum of its inputs. A local population average of this activity is then computed around each SC neuron, and the most active of these local populations determines the next saccade landing position after inverse projection back into visual space. Scanpaths are generated by injecting inhibition into the saliency map at each new fixation location and repeating the above process. We tested the model on a 30-image subset of the MIT saliency dataset (1000 scenes, 15 participants performing a free-viewing task), selected to maximize inter-subject agreement in first-saccade landing positions. Our model predicted the first, second, and third saccade landing positions significantly better than either the adaptive-whitening model or the Itti-Koch model, both of which predict fixations at peaks in a saliency map. Later saccades could not be evaluated because inter-subject agreement in the behavioral data broke down; predictions from all models dropped to chance. These results suggest that neurophysiological constraints from saccade programming should be incorporated into image-based models of fixation selection during scene viewing.
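To make the pipeline concrete, the sketch below implements one plausible reading of the steps described above. It is not the authors' implementation: the saliency front end (`saliency_fn`) is a placeholder, the visuotopic-to-collicular projection uses the commonly cited Ottes, Van Gisbergen, and Eggermont (1986) log-polar mapping with textbook constants (A, Bu, Bv), and the Gaussian pooling and inhibition widths are arbitrary values chosen for illustration.

```python
# Minimal sketch of one saccade-selection step and a short scanpath loop.
# Assumptions (not taken from the abstract): the log-polar mapping constants,
# the pooling widths, and the saliency front end passed in as saliency_fn.
import numpy as np
from scipy.ndimage import gaussian_filter

A, BU, BV = 3.0, 1.4, 1.8  # deg, mm, mm (assumed Ottes et al. 1986 constants)

def visual_to_collicular(R, phi):
    """Map eccentricity R (deg) and direction phi (rad) to SC coords (u, v) in mm.
    A single map is used for the whole field; the real SC splits hemifields."""
    u = BU * np.log(np.maximum(np.sqrt(R**2 + 2*A*R*np.cos(phi) + A**2) / A, 1e-6))
    v = BV * np.arctan2(R*np.sin(phi), R*np.cos(phi) + A)
    return u, v

def collicular_to_visual(u, v):
    """Inverse projection from SC coords (mm) back to (R, phi) in visual space."""
    eu = np.exp(u / BU)
    R = A * np.sqrt(eu**2 - 2*eu*np.cos(v / BV) + 1)
    phi = np.arctan2(eu*np.sin(v / BV), eu*np.cos(v / BV) - 1)
    return R, phi

def next_saccade(saliency, fix_xy, deg_per_px=0.03,
                 sigma_rf_mm=0.3, sigma_pop_mm=0.6, grid_mm=0.05):
    """Pick the next saccade landing point (pixels) from a saliency map (pixels)."""
    h, w = saliency.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dx = (xs - fix_xy[0]) * deg_per_px          # offsets from fixation, in degrees
    dy = (ys - fix_xy[1]) * deg_per_px
    R, phi = np.hypot(dx, dy), np.arctan2(dy, dx)
    u, v = visual_to_collicular(R, phi)

    # Accumulate projected saliency on a regular grid over the SC surface (mm).
    u_bins = np.arange(u.min(), u.max() + grid_mm, grid_mm)
    v_bins = np.arange(v.min(), v.max() + grid_mm, grid_mm)
    sc_map, _, _ = np.histogram2d(u.ravel(), v.ravel(),
                                  bins=[u_bins, v_bins],
                                  weights=saliency.ravel())

    # SC neuron activity: Gaussian-weighted sum of inputs within each RF,
    # followed by a local population average over neighboring neurons.
    activity = gaussian_filter(sc_map, sigma_rf_mm / grid_mm)
    population = gaussian_filter(activity, sigma_pop_mm / grid_mm)

    # The most active local population determines the saccade vector;
    # inverse-project its SC location back into visual (pixel) space.
    iu, iv = np.unravel_index(np.argmax(population), population.shape)
    R_sac, phi_sac = collicular_to_visual(u_bins[iu], v_bins[iv])
    land_x = fix_xy[0] + (R_sac * np.cos(phi_sac)) / deg_per_px
    land_y = fix_xy[1] + (R_sac * np.sin(phi_sac)) / deg_per_px
    return land_x, land_y

def scanpath(saliency_fn, image, start_xy, n_saccades=3, ior_sigma_px=40.0):
    """Generate a scanpath: recompute fixation-blurred saliency, pick a saccade,
    inject Gaussian inhibition at the new fixation, and repeat."""
    fix = np.array(start_xy, dtype=float)
    inhibition = np.zeros(image.shape[:2])
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    path = [tuple(fix)]
    for _ in range(n_saccades):
        sal = saliency_fn(image, fix)               # placeholder fixation-blurred front end
        sal = np.clip(sal - inhibition, 0.0, None)  # assumes saliency normalized to [0, 1]
        fix = np.array(next_saccade(sal, fix))
        inhibition += np.exp(-((xs - fix[0])**2 + (ys - fix[1])**2)
                             / (2 * ior_sigma_px**2))
        path.append(tuple(fix))
    return path
```

Given any fixation-blurred saliency front end, `scanpath(saliency_fn, image, start_xy)` would return a fixation sequence whose first few landing positions could then be compared with observers' saccades, in the spirit of the evaluation described above.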
Meeting abstract presented at VSS 2015