September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Computing Saliency over Proto-Objects Predicts Fixations During Scene Viewing
Author Affiliations
  • Yupei Chen
    Department of Psychology, Stony Brook University
  • Gregory Zelinsky
    Department of Psychology, Stony Brook University
    Department of Computer Science, Stony Brook University
Journal of Vision August 2017, Vol.17, 209. doi:10.1167/17.10.209
Abstract

Most models of fixation prediction operate at the feature level, best exemplified by the Itti-Koch (I-K) saliency model. Others suggest that objects are more important (Einhäuser et al., 2008), but defining objects requires human annotation. We propose a computationally explicit middle ground: predicting fixations using a combination of saliency and mid-level representations of shape known as proto-objects (POs). For 384 real-world scenes we computed an I-K saliency map and a proto-object segmentation, the latter using the model of Yu et al. (2014). We then averaged the saliency values internal to each PO to obtain a salience value for each PO segment. The maximally salient PO determined the next fixation, with the specific x,y position being the saliency-weighted centroid of the PO's shape. To generate sequences of saccades, we inhibited fixated locations in the saliency map, as in the I-K model. This PO-saliency model outperformed (p < .001) the I-K saliency model in predicting fixation-density maps obtained from 12 participants freely viewing the same 384 scenes (3 seconds each); comparison to the GBVS saliency model showed a similarly significant benefit. We also manipulated the coarseness of the PO segmentation of each scene over five levels on a fixation-by-fixation basis, so that the first predicted fixation was based on the coarsest segmentation and the fifth on the finest. Doing so yielded considerable improvements over the other tested saliency models, largely because it captured a relationship between center bias and ordinal fixation position: rather than being an ad hoc addition to a saliency model, a center bias falls out of our model via its coarse-to-fine segmentation of a scene over time (fixations). We conclude that fixations are best modeled at the level of proto-objects, which combines the benefit of objects with the computability of features.
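The fixation-selection step described above — average saliency within each proto-object segment, pick the maximally salient segment, fixate its saliency-weighted centroid, then inhibit the fixated location — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a precomputed 2-D saliency map (e.g. from an Itti-Koch model) and an integer proto-object label map, and the Gaussian inhibition width `sigma` is an arbitrary choice.

```python
import numpy as np

def next_fixation(saliency, po_labels, inhibition):
    """Select the next fixation from a saliency map and a PO segmentation.

    saliency   -- 2-D array of saliency values (e.g. an Itti-Koch map)
    po_labels  -- integer label map assigning each pixel to a proto-object
    inhibition -- multiplicative inhibition-of-return map (1 = uninhibited)

    Returns (x, y, label): the saliency-weighted centroid of the
    maximally salient proto-object, and that PO's label.
    All names here are illustrative, not from the authors' code.
    """
    s = saliency * inhibition
    # mean saliency internal to each proto-object segment
    means = {lab: s[po_labels == lab].mean() for lab in np.unique(po_labels)}
    best = max(means, key=means.get)
    ys, xs = np.nonzero(po_labels == best)
    w = s[ys, xs]
    total = w.sum()
    w = w / total if total > 0 else np.full(len(w), 1.0 / len(w))
    # saliency-weighted centroid of the winning PO's shape
    return float((xs * w).sum()), float((ys * w).sum()), best

def inhibit(inhibition, x, y, sigma=20.0):
    """Suppress the fixated location with a Gaussian, in the spirit of
    Itti-Koch inhibition of return (the width sigma is an assumption)."""
    h, w = inhibition.shape
    yy, xx = np.mgrid[0:h, 0:w]
    g = np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    return inhibition * (1.0 - g)
```

Iterating `next_fixation` and `inhibit` produces a saccade sequence; the coarse-to-fine manipulation in the abstract would additionally swap in a finer `po_labels` map on each iteration, which is what drives the emergent center bias.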

Meeting abstract presented at VSS 2017
