October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract  |   October 2020
Pooling model of tilt estimation based on surface tilt statistics in natural scenes
Author Affiliations & Notes
  • Seha Kim
    University of Pennsylvania
  • Johannes Burge
    University of Pennsylvania
  • Footnotes
    Acknowledgements  This work was supported by NIH grant R01-EY028571 from the National Eye Institute & Office of Behavioral and Social Science Research and NIH grant R01-EY011747 from the National Eye Institute.
Journal of Vision October 2020, Vol.20, 1239. doi:https://doi.org/10.1167/jov.20.11.1239
      © ARVO (1962-2015); The Authors (2016-present)


Visual systems estimate the three-dimensional (3D) structure of the environment from two-dimensional (2D) retinal images. To improve accuracy, visual systems use multiple sources of information. Here, we examine how human visual systems use prior information about the world to improve the estimation of 3D surface tilt. We analyzed the statistics of 3D tilts in natural scenes from a large stereo-image database with co-registered distance information at each pixel. We found a systematic pattern governing how tilts are spatially related in natural scenes. We designed a hierarchical model that pools local tilt estimates in accordance with these scene statistics. The model first computes a Bayes-optimal local estimate given three image cues (i.e., luminance, texture, and disparity). The model then computes a “global” estimate by pooling the local estimates within a neighborhood centered on the target location. The orientation and aspect ratio of each pooling neighborhood were dictated by the natural scene statistics. The model was evaluated on how accurately it estimated ground-truth tilt in natural scenes and on how accurately it predicted human performance. Human performance was determined in a psychophysical experiment. Humans viewed natural scenes through a stereoscopically defined circular aperture that was 3 deg in diameter. The task was to estimate the surface tilt at the center of the patch via a mouse-controlled probe. Four human observers participated in two experiments; each experiment contained 3600 unique stimuli. We found that the global model provided more accurate estimates of ground-truth tilt and better predictions of human performance than the local model. We also found that the pooling neighborhood areas that maximized estimation accuracy were very similar to the pooling neighborhood areas that best predicted human performance.
Taken together, the results suggest that human visual systems integrate local estimates in accordance with the statistics of surface tilt in natural scenes.
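
The pooling step described above can be illustrated with a minimal sketch. The abstract does not state the pooling rule, so this example assumes an unweighted circular mean of local tilt estimates inside an elliptical neighborhood; the function names (`elliptical_mask`, `pool_tilt`) and all parameter values are hypothetical, and the local estimates are taken as given rather than computed from image cues. A circular mean is used because tilt is an angle that wraps at 360 deg, so an ordinary average would be misleading near the wrap point.

```python
import math


def elliptical_mask(size, major_radius, aspect_ratio, orientation_deg):
    """Boolean size x size grid marking an ellipse centered on the grid.

    Assumption: the pooling neighborhood is a hard-edged ellipse whose
    orientation and aspect ratio would be set from scene statistics.
    """
    center = (size - 1) / 2.0
    theta = math.radians(orientation_deg)
    minor_radius = major_radius / aspect_ratio
    mask = []
    for row in range(size):
        mask_row = []
        for col in range(size):
            dx, dy = col - center, center - row  # image rows grow downward
            # Rotate the offset into the ellipse's own axes.
            u = dx * math.cos(theta) + dy * math.sin(theta)
            v = -dx * math.sin(theta) + dy * math.cos(theta)
            inside = (u / major_radius) ** 2 + (v / minor_radius) ** 2 <= 1.0
            mask_row.append(inside)
        mask.append(mask_row)
    return mask


def pool_tilt(local_tilts_deg, mask):
    """Circular mean (in degrees, range [0, 360)) of tilts where mask is True.

    Assumption: unweighted pooling; the actual model may weight estimates.
    """
    sin_sum = cos_sum = 0.0
    for tilt_row, mask_row in zip(local_tilts_deg, mask):
        for tilt, inside in zip(tilt_row, mask_row):
            if inside:
                sin_sum += math.sin(math.radians(tilt))
                cos_sum += math.cos(math.radians(tilt))
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Hypothetical usage: a 5 x 5 patch of local estimates, all equal to 90 deg,
# pooled within a horizontally elongated neighborhood.
local_tilts = [[90.0] * 5 for _ in range(5)]
mask = elliptical_mask(size=5, major_radius=2, aspect_ratio=2, orientation_deg=0)
global_tilt = pool_tilt(local_tilts, mask)
```

In this sketch the neighborhood shape is the only place where scene statistics enter; a fuller version might replace the hard-edged mask with graded weights.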

