August 2016
Volume 16, Issue 12
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2016
Adding Shape to Saliency: A Proto-object Saliency Map for Predicting Fixations during Scene Viewing
Author Affiliations
  • Yupei Chen
    Department of Psychology
  • Chen-Ping Yu
    Department of Computer Science
  • Gregory Zelinsky
    Department of Psychology
Journal of Vision September 2016, Vol.16, 1309. doi:
      Yupei Chen, Chen-Ping Yu, Gregory Zelinsky; Adding Shape to Saliency: A Proto-object Saliency Map for Predicting Fixations during Scene Viewing. Journal of Vision 2016;16(12):1309. doi:

      © ARVO (1962-2015); The Authors (2016-present)

Traditional saliency models predict fixations during scene viewing by computing local contrast between low-level color, intensity, and orientation features; the higher the summed contrast, the greater the probability of fixation. Evidence also suggests that high-level properties of objects are predictive of fixation locations. We attempt fixation prediction from proto-objects (POs), a mid-level representation existing between features and objects. Using our previously reported proto-object model (Yu et al., 2014, JoV), we segmented 384 images of real-world scenes into proto-objects, fragments of visual space, at multiple resolutions (feature-space bandwidths). From these segmentations we then built a saliency map by computing feature contrast between each proto-object and its local neighbors using intensity, color, orientation, and now, size and shape features. Center-surround size contrast was computed by comparing pixel area between a given proto-object and each "surrounding" neighbor. To compute shape contrast, we first normalized a proto-object and a neighbor to have the same area, aligned them to maximize area overlap, and then counted the number of pixels in the overlapping area and divided by the number of pixels in the union of the two areas; a smaller overlap across neighbors codes a higher contrast. Doing this relative to each proto-object, then combining contrast signals across features and resolutions, generates a proto-object saliency map, which we used to predict the fixation behavior of 12 participants freely viewing the same scenes (each for 3 seconds) in anticipation of a memory test. We found that our proto-object saliency map predicted fixations as well as or better than an Itti-Koch saliency model, and was nearly as predictive as the upper limit defined by a Subject model obtained using the leave-one-out method.
We conclude that size and shape features, quantified in terms of proto-objects, are used to guide overt visual attention, and that saliency-based models of fixation prediction need to recognize the importance of these mid-level visual features.
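
The size and shape contrast computations described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the normalized-difference formula for size contrast, the nearest-neighbor rescaling used to equalize areas, the brute-force translation-only alignment, and the averaging over neighbors are all assumptions made for illustration, since the abstract does not specify these details. Proto-objects are assumed to be given as boolean NumPy masks.

```python
import numpy as np


def crop(mask):
    """Crop a boolean mask to the bounding box of its True pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def rescale(mask, factor):
    """Nearest-neighbor rescale of a boolean mask (assumed method)."""
    h, w = mask.shape
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    r = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    c = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return mask[np.ix_(r, c)]


def size_contrast(area, neighbor_areas):
    """Mean normalized pixel-area difference to the neighbors.

    The formula |a - b| / max(a, b) is an assumption; the abstract only
    says that pixel areas are compared.
    """
    nb = np.asarray(neighbor_areas, dtype=float)
    return float(np.mean(np.abs(area - nb) / np.maximum(area, nb)))


def max_overlap_iou(a, b):
    """Slide b over a, keep the largest overlap, return overlap / union."""
    ha, wa = a.shape
    hb, wb = b.shape
    # Pad a so that b can be tried at every offset that touches it.
    canvas = np.zeros((ha + 2 * (hb - 1), wa + 2 * (wb - 1)), dtype=bool)
    canvas[hb - 1:hb - 1 + ha, wb - 1:wb - 1 + wa] = a
    best_inter = 0
    for dy in range(canvas.shape[0] - hb + 1):
        for dx in range(canvas.shape[1] - wb + 1):
            window = canvas[dy:dy + hb, dx:dx + wb]
            inter = int(np.logical_and(window, b).sum())
            best_inter = max(best_inter, inter)
    union = int(a.sum()) + int(b.sum()) - best_inter
    return best_inter / union


def shape_contrast(mask, neighbor_masks):
    """Mean of (1 - overlap/union) over neighbors, after area matching."""
    a = crop(mask)
    scores = []
    for nb in neighbor_masks:
        b = crop(nb)
        # Scale the neighbor so its area approximately matches a's area.
        b = rescale(b, np.sqrt(a.sum() / b.sum()))
        scores.append(1.0 - max_overlap_iou(a, b))
    return float(np.mean(scores))
```

Under these assumptions, two squares of different sizes yield zero shape contrast once their areas are matched, while a square compared with a thin bar of equal area yields a high value. The exhaustive offset search is quadratic in the mask size, so a practical version would likely use cross-correlation instead.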

Meeting abstract presented at VSS 2016

