September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Spatial frequency tuning for indoor scene categorization
Author Affiliations
  • Verena Willenbockel
    Scene Grammar Lab, Department of Psychology, Goethe University Frankfurt, Germany
    Department of Psychology, University of Victoria, BC, Canada
  • Frédéric Gosselin
    Département de Psychologie, Université de Montréal, QC, Canada
  • Melissa Vo
    Scene Grammar Lab, Department of Psychology, Goethe University Frankfurt, Germany
Journal of Vision August 2017, Vol.17, 564. doi:
      Verena Willenbockel, Frédéric Gosselin, Melissa Vo; Spatial frequency tuning for indoor scene categorization. Journal of Vision 2017;17(10):564.

      © ARVO (1962-2015); The Authors (2016-present)


Indoor scenes typically contain a wealth of cues that are potentially useful for recognition, including characteristic global spatial properties and the objects contained in the scene. Which of these cues do people actually use for accurate and quick scene categorization? Here we investigated this question in the spatial frequency (SF) domain. Using the SF Bubbles technique (Willenbockel et al., 2010), we examined which SFs are significantly correlated with observers' response times (RTs) in a scene categorization task with four categories: bathroom, bedroom, kitchen, and office. The base stimuli consisted of 800 gray-scale photographs (256 × 256 pixels) equated in luminance histograms and rated as typical exemplars of the respective category. On each trial, they were SF filtered using 20 randomly distributed Gaussian "bubbles" with a standard deviation of 1.8. The stimuli were presented in random order at a visual angle of 6 degrees and remained on the screen until the observer's response. Observers were instructed to press the space bar as soon as they recognized the scene category and, upon stimulus offset, to press the key corresponding to that category. Feedback was provided after each trial. Mean accuracy across observers was 88.16% correct; mean RT was 486 ms. RTs did not differ significantly between categories. A multiple linear regression on the transformed RTs from correct trials and the respective SF filters revealed two SF bands significantly linked with fast responses: one around 3 cycles per image (cpi) and one around 28 cpi (octave width about 1). Interestingly, the latter SF band overlaps with the SFs found to be correlated with fast and accurate object recognition (Caplette et al., 2014). Our results suggest that people use a combination of the scene gist conveyed by low SFs and object information conveyed by a narrow band of relatively high SFs to recognize complex indoor scenes effectively.
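
The logic of the analysis described above can be illustrated with a small simulation. The sketch below is not the authors' implementation, and the published SF Bubbles procedure may differ in details the abstract does not give. It assumes that the bubble standard deviation of 1.8 is expressed in cpi on a linear frequency axis, it replaces the multiple linear regression on transformed RTs with a separate least-squares slope per frequency on raw RTs, and it uses a hypothetical observer whose effect sizes and noise level are made up. The function names (`sampling_vector`, `slopes_per_frequency`, `run_demo`, `octave_band`) are illustrative only. The last helper shows that a band about one octave wide around 28 cpi would span roughly 20 to 40 cpi if it is symmetric on a log2 axis, which is also an assumption.

```python
import math
import random

N_FREQS = 128     # 1..128 cpi: the range a 256-pixel-wide image can carry
N_BUBBLES = 20    # bubbles per trial, as stated in the abstract
BUBBLE_SD = 1.8   # bubble SD from the abstract; its unit is ASSUMED to be cpi here


def sampling_vector(rng, n_freqs=N_FREQS, n_bubbles=N_BUBBLES, sd=BUBBLE_SD):
    """Gain in [0, 1] for each SF, summed over randomly centred Gaussian bubbles."""
    centers = [rng.uniform(1, n_freqs) for _ in range(n_bubbles)]
    gains = []
    for f in range(1, n_freqs + 1):
        total = sum(math.exp(-0.5 * ((f - c) / sd) ** 2) for c in centers)
        gains.append(min(total, 1.0))  # clip where bubbles overlap
    return gains


def slopes_per_frequency(filters, rts):
    """Least-squares slope of RT on filter gain, computed separately for each SF."""
    n = len(rts)
    mean_rt = sum(rts) / n
    slopes = []
    for j in range(len(filters[0])):
        column = [trial[j] for trial in filters]
        mean_gain = sum(column) / n
        cov = sum((g - mean_gain) * (rt - mean_rt) for g, rt in zip(column, rts))
        var = sum((g - mean_gain) ** 2 for g in column)
        slopes.append(cov / var if var > 0 else 0.0)
    return slopes


def run_demo(n_trials=1000, seed=0):
    """Simulate a HYPOTHETICAL observer who is faster when 3 and 28 cpi are revealed."""
    rng = random.Random(seed)
    filters, rts = [], []
    for _ in range(n_trials):
        gains = sampling_vector(rng)
        # Index f - 1 holds the gain for f cpi; effect sizes and noise are made up.
        rt = 600.0 - 80.0 * gains[2] - 80.0 * gains[27] + rng.gauss(0.0, 20.0)
        filters.append(gains)
        rts.append(rt)
    return slopes_per_frequency(filters, rts)


def octave_band(center, width_octaves):
    """Lower and upper edge of a band centred (on a log2 axis) at `center`."""
    half = width_octaves / 2.0
    return center * 2.0 ** -half, center * 2.0 ** half


if __name__ == "__main__":
    slopes = run_demo()
    fastest = sorted(range(len(slopes)), key=lambda j: slopes[j])[:6]
    print("SFs (cpi) with the most negative RT slopes:", sorted(j + 1 for j in fastest))
    print("1-octave band around 28 cpi:", octave_band(28, 1))
```

In this toy setting, strongly negative slopes mark the frequencies whose presence speeds up responses. Testing which slopes are statistically significant, as reported in the abstract, would require an additional step that is not sketched here.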

Meeting abstract presented at VSS 2017

