Abstract
Our group has previously shown that scene content can be predicted from the eye movements observers make when viewing colour photographs. The time course of category predictions reveals differential contributions of bottom-up and top-down processes at different viewing times. Here, we use these known differences to determine when, and to what extent, image features at different representational levels contribute to guiding gaze in a content-specific manner. Seventy-seven participants viewed grayscale photographs and line drawings of real-world scenes. In a leave-one-subject-out cross-validation analysis, scene categories were predicted from gaze patterns over a 2-second time course. Scene categories could be predicted from gaze at all times, both in photographs (average accuracy = 31.4%, chance = 16.7%, p < 0.0001) and in line drawings (30.0%, p < 0.0001). We also replicate the time course: an initial steep decrease in prediction accuracy from 300 ms to 500 ms, reflecting the contribution of bottom-up information, followed by a steady increase, reflecting top-down knowledge of category-specific information. Using DeepGaze II, currently the leading model of salience, we reconfirm a strong early contribution of bottom-up effects in grayscale photographs. To assess their differential contributions to the content-specific guidance of gaze, we computed low-level features (luminance contrast and orientation statistics) and mid-level features (local symmetry and contour junctions) from the images. For photographs, we find that these representational levels make qualitatively similar contributions, mostly to the initial bottom-up peak. For line drawings of the same scenes, mid-level features that describe scene structure (symmetry and junctions) play a more prominent role in the top-down guidance of gaze. Thus, we show that bottom-up information contributes less to gaze behaviour for line drawings than for photographs, and that structural features increasingly guide category-specific gaze when images are reduced to line drawings.
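For readers unfamiliar with the analysis scheme, the following is a minimal illustrative sketch of leave-one-subject-out category prediction from gaze, not the authors' actual classifier. It assumes fixation density maps of shape (subjects, images, height, width), per-image category labels, and a simple correlation-with-template decision rule; all names and data shapes here are hypothetical.

    import numpy as np

    def predict_categories_loso(density_maps, labels, n_categories=6):
        """Leave-one-subject-out: classify each held-out subject's fixation
        maps by correlating them with category-average maps built from the
        remaining subjects. Chance level is 1 / n_categories (16.7% for 6)."""
        n_subjects, n_images, h, w = density_maps.shape
        flat = density_maps.reshape(n_subjects, n_images, h * w)
        accuracies = []
        for s in range(n_subjects):
            train = np.delete(flat, s, axis=0)  # all subjects except s
            # One mean fixation-density template per scene category
            templates = np.stack([
                train[:, labels == c].mean(axis=(0, 1))
                for c in range(n_categories)
            ])
            # Predict the category whose template best correlates with
            # the held-out subject's map for each image
            test = flat[s]
            correct = 0
            for i in range(n_images):
                r = [np.corrcoef(test[i], t)[0, 1] for t in templates]
                correct += int(np.argmax(r) == labels[i])
            accuracies.append(correct / n_images)
        return np.mean(accuracies)  # average cross-validated accuracy

Restricting the density maps to fixations within a given time window would yield the time-resolved accuracies described above.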
Meeting abstract presented at VSS 2018