September 2019
Volume 19, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2019
The role of recurrent processing in visual scene categorization
Author Affiliations & Notes
  • Jamie L Siegart
    Program in Neuroscience, Bates College
  • Wuyue Zhou
    Program in Neuroscience, Bates College
  • Enton Lam
    Program in Neuroscience, Bates College
  • Munashe Machoko
    Program in Neuroscience, Bates College
  • Michelle R Greene
    Program in Neuroscience, Bates College
Journal of Vision September 2019, Vol.19, 129b. doi:

      Jamie L Siegart, Wuyue Zhou, Enton Lam, Munashe Machoko, Michelle R Greene; The role of recurrent processing in visual scene categorization. Journal of Vision 2019;19(10):129b. doi:

      © ARVO (1962-2015); The Authors (2016-present)

The core of visual scene understanding can be accomplished within a single fixation (Fei-Fei et al., 2007; Greene & Oliva, 2009). This speed has been taken to imply that core visual processing can be accomplished in a feedforward manner (Serre et al., 2007). Deep convolutional neural networks (dCNNs), themselves exclusively feedforward, have achieved categorization abilities rivaling those of human observers (Russakovsky et al., 2015). However, these networks fail to explain critical neural and behavioral responses (Geirhos et al., 2017), and cannot account for the massive feedback connections in primate visual systems (Felleman & Van Essen, 1991). Work in monkey physiology has demonstrated that images that are easy for humans to recognize but difficult for dCNNs take longer to decode from IT cortex (Kar et al., 2018), suggesting a role for recurrent processing in difficult images. However, these images are not meaningful to the monkeys and consist of photoshopped objects on arbitrary backgrounds. To what extent does this pattern hold for humans viewing familiar images? We identified 25 scene categories from the SUN database whose images led to a range of categorization accuracies in three dCNNs. For each category, we selected 20 images with high dCNN performance (easy) and 20 images with low dCNN performance (hard). Observers viewed these images in random order while performing 2AFC categorization. We recorded 64-channel EEG and submitted the raw waveforms to a linear decoder to assess category information. Observers were on average 50 ms faster to categorize images that were easy for the dCNNs. Although both easy and hard images could be decoded at an above-chance level, decodable information for the easy images was available by 57 ms post-image onset and peaked after 85 ms, whereas information about hard images was not available until 72 ms, peaking at 192 ms. Together, this pattern suggests a role for recurrent processing in human scene categorization.
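The time-resolved decoding logic described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract states only that raw waveforms were submitted to a linear decoder, so the nearest-centroid classifier, the 5-fold cross-validation, the accuracy threshold used to estimate onset, and all function and variable names here are assumptions chosen for illustration. The data are synthetic (two classes, 64 channels, with class information injected from a known sample index) and stand in for real EEG epochs.

```python
import numpy as np


def decode_over_time(X, y, n_folds=5, seed=0):
    """Cross-validated decoding accuracy at each time point.

    X: (n_trials, n_channels, n_times) epoched waveforms.
    y: (n_trials,) integer category labels.
    Uses a nearest-centroid classifier (an assumed stand-in for
    whatever linear decoder a real analysis would use).
    """
    rng = np.random.default_rng(seed)
    n_trials, _, n_times = X.shape
    classes = np.unique(y)
    folds = np.array_split(rng.permutation(n_trials), n_folds)
    correct = np.zeros(n_times)
    for test_idx in folds:
        train_mask = np.ones(n_trials, dtype=bool)
        train_mask[test_idx] = False
        X_train, y_train = X[train_mask], y[train_mask]
        # One centroid per class and time point: (n_classes, n_channels, n_times)
        centroids = np.stack(
            [X_train[y_train == c].mean(axis=0) for c in classes]
        )
        # Squared distance of each test trial to each centroid, per time point
        diffs = X[test_idx][:, None, :, :] - centroids[None, :, :, :]
        dists = (diffs ** 2).sum(axis=2)       # (n_test, n_classes, n_times)
        pred = classes[dists.argmin(axis=1)]   # (n_test, n_times)
        correct += (pred == y[test_idx][:, None]).sum(axis=0)
    return correct / n_trials


def first_above(acc, threshold):
    """Index of the first time point whose accuracy exceeds threshold, or -1."""
    above = np.flatnonzero(acc > threshold)
    return int(above[0]) if above.size else -1


# Synthetic demonstration (hypothetical numbers, not data from the study)
rng = np.random.default_rng(42)
n_trials, n_channels, n_times, signal_start = 200, 64, 50, 20
y = np.repeat([0, 1], n_trials // 2)
X = rng.normal(size=(n_trials, n_channels, n_times))
pattern = 0.5 * rng.normal(size=n_channels)
# Class 1 carries a fixed spatial pattern from signal_start onward
X[y == 1, :, signal_start:] += pattern[:, None]

acc = decode_over_time(X, y)
onset = first_above(acc, threshold=0.75)
print("mean accuracy before signal:", acc[:signal_start].mean())
print("mean accuracy after signal:", acc[signal_start:].mean())
print("estimated onset index:", onset)
```

Running the same function separately on easy and hard trials and comparing the resulting onset and peak indices would mirror the comparison reported above. A real analysis would also need a statistical test against chance rather than a fixed threshold.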

Acknowledgement: National Science Foundation (1736274) grant to MRG 
