Abstract
Infant researchers are increasingly relying on computational models of bottom-up visual saliency, such as Graph-Based Visual Saliency (GBVS; Harel et al., 2006), to better understand the development of visual attention. GBVS predicts where an observer will look by extracting low-level features (color, intensity, orientation) from images and computing an overall map from a weighted combination of these feature channels. The resulting maps are thought to reflect the distribution of the physical saliency of an image. However, GBVS was designed to predict adult fixations, and it is therefore unclear to what extent this model can reliably predict infant fixations or approximate infant visual saliency processing. We recorded eye gaze from 4- (N = 19), 6- (N = 21), and 10-month-old infants (N = 23) and adults (N = 24) as they viewed up to 48 naturalistic scenes from the MIT Saliency Benchmark Project (Judd et al., 2012). Correlations between each participant's fixation density map and the GBVS saliency map for each scene were higher for adults than for infants, indicating poorer GBVS performance for infant fixation data. Maps constructed for each of the individual channels (color, intensity, orientation) revealed that eye gaze was best predicted by orientation. Although GBVS performance did not increase over infancy, comparison of the GBVS-fixation density correlations to the noise ceiling (i.e., leave-one-out fixation density correlations) revealed that at 4 months, physical salience as measured by GBVS, and orientation in particular, accounted for nearly all explainable variation in eye gaze. However, the proportion of explainable variance accounted for by physical salience decreased dramatically across infancy. We suggest that young infants' limited visual acuity and cortical development may result in qualitatively different processing of physical salience in naturalistic scene-viewing tasks compared to adults.
Future work will explore how differences in scene properties (e.g., entropy/clutter) may relate to GBVS performance.
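The map-level analysis summarized above (per-scene correlations between saliency and fixation density maps, plus a leave-one-out noise ceiling) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and details such as the fixation-smoothing kernel and the exact correlation metric are not specified in the abstract (Pearson correlation over flattened maps is assumed here).

```python
import numpy as np

def map_correlation(saliency_map, fixation_density):
    """Pearson correlation between two maps (e.g., a GBVS saliency map
    and one participant's fixation density map), computed over pixels.
    Assumes both maps are 2-D arrays of the same shape."""
    s = saliency_map.ravel()
    f = fixation_density.ravel()
    return float(np.corrcoef(s, f)[0, 1])

def noise_ceiling(density_maps):
    """Leave-one-out noise ceiling for one scene: correlate each
    participant's fixation density map with the mean map of the
    remaining participants, then average across participants.
    `density_maps` is an (n_participants, H, W) array."""
    n = len(density_maps)
    total = np.sum(density_maps, axis=0)
    loo_rs = []
    for i in range(n):
        # Mean fixation density map of all other participants.
        loo_mean = (total - density_maps[i]) / (n - 1)
        loo_rs.append(map_correlation(density_maps[i], loo_mean))
    return float(np.mean(loo_rs))
```

The GBVS-to-fixation correlation for a scene can then be compared against this ceiling to estimate the proportion of explainable variance in gaze that physical salience accounts for.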