September 2011
Volume 11, Issue 11
Vision Sciences Society Annual Meeting Abstract  |   September 2011
Translation Invariance of Natural Scene Categories
Author Affiliations
  • Marius Catalin Iordan
    Computer Science Department, Stanford University
  • Christopher Baldassano
    Computer Science Department, Stanford University
  • Dirk B. Walther
    Psychology Department, Ohio State University
  • Diane M. Beck
    Psychology Department, University of Illinois at Urbana-Champaign
    Beckman Institute, University of Illinois at Urbana-Champaign
  • Li Fei-Fei
    Computer Science Department, Stanford University
Journal of Vision September 2011, Vol.11, 816. doi:https://doi.org/10.1167/11.11.816
      Marius Catalin Iordan, Christopher Baldassano, Dirk B. Walther, Diane M. Beck, Li Fei-Fei; Translation Invariance of Natural Scene Categories. Journal of Vision 2011;11(11):816. https://doi.org/10.1167/11.11.816.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Natural scene categorization is fast, effortless, and invariant to many factors such as point of view, scale, and position in the visual field. Previous work has shown that scene category can be decoded from fMRI voxel activations in areas as early as V1, but more robustly in higher areas in the ventral visual stream, such as PPA, LOC, and RSC (Walther et al., 2009). A natural and fundamental question arises about this process: Where along the stream does the scene representation become invariant to geometric transformations? We address this question for the specific case of translation of natural scene images within the visual field. We conducted an fMRI experiment in which subjects passively viewed images of natural scenes presented to either their left or right visual field (“Left” and “Right” conditions, respectively). Using the voxel activations obtained during the experiment, we trained SVM classifiers to predict the category of the natural scene viewed by the subject. The training and testing sets were each drawn from one of the two conditions above, for a total of four classification conditions (train on “Left” and test on “Left”, train on “Left” and test on “Right”, etc.). In retinotopic areas (V1 through V4), classification was significantly more accurate when training and testing on scenes presented to the same side of the visual field than when training on one side and testing on the other. In PPA, however, we found comparable classification accuracies whether we trained and tested on images from the same visual field or trained on one visual field and tested on the opposite one (translation invariance). This distinction suggests that, just as LOC encodes geometrically invariant object representations, higher areas in the ventral stream (specifically PPA) abstract away geometric variance in encoding natural scenes.
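The four-way train/test design described above can be illustrated with a minimal sketch. This is not the authors' analysis: the voxel patterns are synthetic toy data, the classifier is a simple nearest-centroid rule standing in for the SVM so that the example needs only the Python standard library, and every name and parameter (`make_trials`, `cross_decode`, the "retinotopic" and "invariant" toy regions, the voxel counts and noise level) is a hypothetical choice made for illustration.

```python
import random

FIELDS = ("Left", "Right")


def make_trials(region, field, n_categories, n_voxels, n_per_class, noise, rng):
    """Simulate noisy voxel patterns for one visual-field condition (toy data)."""
    half = n_voxels // 2
    patterns, labels = [], []
    for category in range(n_categories):
        for _ in range(n_per_class):
            pattern = [rng.gauss(0.0, noise) for _ in range(n_voxels)]
            if region == "invariant":
                # Assumption: the category code is the same for both fields.
                active = range(n_voxels)
            elif field == "Left":
                # Assumption: in the "retinotopic" toy region each field
                # drives its own half of the voxels.
                active = range(0, half)
            else:
                active = range(half, n_voxels)
            for v in active:
                if v % n_categories == category:
                    pattern[v] += 1.0
            patterns.append(pattern)
            labels.append(category)
    return patterns, labels


def fit_centroids(patterns, labels):
    """'Train' by averaging the patterns of each category (SVM stand-in)."""
    sums, counts = {}, {}
    for x, y in zip(patterns, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def accuracy(centroids, patterns, labels):
    """Fraction of trials whose nearest centroid carries the true label."""
    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )
    hits = sum(predict(x) == y for x, y in zip(patterns, labels))
    return hits / len(labels)


def cross_decode(region, n_categories=4, n_voxels=40, n_per_class=30,
                 noise=0.5, seed=0):
    """Return accuracy for all four (train field, test field) combinations."""
    rng = random.Random(seed)
    args = (n_categories, n_voxels, n_per_class, noise, rng)
    train = {f: make_trials(region, f, *args) for f in FIELDS}
    test = {f: make_trials(region, f, *args) for f in FIELDS}
    results = {}
    for train_field in FIELDS:
        centroids = fit_centroids(*train[train_field])
        for test_field in FIELDS:
            results[(train_field, test_field)] = accuracy(
                centroids, *test[test_field])
    return results


for region in ("retinotopic", "invariant"):
    for (train_field, test_field), acc in cross_decode(region).items():
        print(f"{region}: train {train_field}, test {test_field}: {acc:.2f}")
```

In this toy setup, a position-dependent code should decode well only when the training and testing fields match, while a position-independent code should decode about equally well in all four conditions. That contrast is the signature of translation invariance the experiment looks for; the simulated numbers say nothing about the actual fMRI results.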

Stanford SGF (MCI); NSF GRFP (CB); NIH R01 EY019429 (DBW, DMB, and LF-F).