December 2022
Volume 22, Issue 14
Open Access
Vision Sciences Society Annual Meeting Abstract
Characterizing internal models for scene vision
Author Affiliations & Notes
  • Daniel Kaiser
    Mathematical Institute, Justus-Liebig-University Giessen
    Center for Mind, Brain and Behavior, Justus-Liebig-University Giessen and Philipps-University Marburg
  • Matthew Foxwell
    Institute of Psychology, University of York
  • Footnotes
    Acknowledgements  D.K. is supported by the German Research Foundation (DFG), grant KA4683/2-1.
Journal of Vision December 2022, Vol.22, 4119. doi:
      Daniel Kaiser, Matthew Foxwell; Characterizing internal models for scene vision. Journal of Vision 2022;22(14):4119.

      © ARVO (1962-2015); The Authors (2016-present)

The visual brain is often conceptualized as a predictive system. Under this view, visual inputs are constantly matched against internal models of what the world should look like, with higher similarity between input and model leading to more efficient processing. Given our many prior expectations about the structure of everyday environments, such predictions should be particularly potent in natural scene perception. In a series of experiments, we asked whether perceptual performance is explained by how well scenes match participants' personal internal models of natural scene categories. Participants completed drawing tasks in which they sketched their most typical versions of kitchens and lounges, which we used as descriptors of their internal models. These drawings were then converted into 3D renders. Using these renders in a scene categorization task, we observed better categorization for renders based on participants' own drawings than for renders based on others' drawings or on arbitrary scene photographs. Further, using a deep neural network (DNN) trained on scene categorization, we investigated whether graded similarity to participants' own drawings predicted categorization performance. We found that behavioral categorization was better when the DNN's response to a scene was more similar to its response to the participant's typical scene of the same category, and more dissimilar to its response to the participant's typical scene of the other category. This effect was specifically observed in late DNN layers, suggesting that perceptual efficiency is determined by high-level visual similarity to the internal model. Together, these results show that perception operates more efficiently in environments that adhere to our internal models of the world. They further highlight that, to make progress in understanding natural vision, we need to account for idiosyncrasies in personal priors.
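
The DNN-based similarity analysis described above can be illustrated with a minimal sketch. The abstract does not state which network, layer, or similarity measure was used, so everything here is an assumption made for illustration: Pearson correlation stands in for the similarity measure, random vectors stand in for late-layer activations, and the names `pearson_similarity` and `model_match_score` are hypothetical.

```python
import numpy as np


def pearson_similarity(a, b):
    """Pearson correlation between two flattened activation vectors.

    Assumption: the abstract does not name its similarity measure;
    correlation is used here only as a common, simple choice.
    """
    a = np.ravel(a).astype(float)
    b = np.ravel(b).astype(float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def model_match_score(scene_act, own_same_act, own_other_act):
    """Hypothetical predictor: similarity to the participant's typical
    scene of the same category minus similarity to their typical scene
    of the other category. Higher values mean a closer match to the
    participant's internal model of the correct category."""
    same = pearson_similarity(scene_act, own_same_act)
    other = pearson_similarity(scene_act, own_other_act)
    return same - other


# Placeholder data: random vectors standing in for late-layer DNN
# activations (not real network outputs).
rng = np.random.default_rng(0)
typical_kitchen = rng.normal(size=512)  # participant's typical kitchen
typical_lounge = rng.normal(size=512)   # participant's typical lounge
# A test kitchen that resembles the participant's typical kitchen.
test_kitchen = typical_kitchen + 0.5 * rng.normal(size=512)

score = model_match_score(test_kitchen, typical_kitchen, typical_lounge)
print(f"model-match score for the test kitchen: {score:.3f}")
```

Under these assumptions, one such score per trial and per layer could be related to categorization accuracy or response time, for example by correlating the two across trials; the abstract does not specify how this relationship was tested.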

