Marius Catalin Iordan, Christopher Baldassano, Dirk B. Walther, Diane M. Beck, Li Fei-Fei; Translation Invariance of Natural Scene Categories. Journal of Vision 2011;11(11):816. doi: 10.1167/11.11.816.
© ARVO (1962-2015); The Authors (2016-present)
Natural scene categorization is fast, effortless, and invariant to many factors such as point of view, scale, and position in the visual field. Previous work has shown that scene category can be decoded from fMRI voxel activations in areas as early as V1, but more robustly in higher areas of the ventral visual stream, such as PPA, LOC, and RSC (Walther et al., 2009). A natural and fundamental question arises about this process: where along the stream does the scene representation become invariant to geometric transformations? We address this question for the specific case of translation of natural scene images within the visual field. We conducted an fMRI experiment in which subjects passively viewed images of natural scenes presented to either their left or their right visual field (“Left” and “Right” conditions, respectively). Using the voxel activations obtained during the experiment, we trained SVM classifiers to predict the category of the natural scene viewed by the subject. The training set and the testing set were each drawn from one of the two conditions, for a total of four classification conditions (train on “Left” and test on “Left”, train on “Left” and test on “Right”, etc.). In retinotopic areas (V1 through V4), classification was significantly more accurate when training and testing on scenes presented to the same side of the visual field than when training on one side and testing on the opposite side. In PPA, however, we found comparable classification accuracies whether we trained and tested on images from the same visual field or trained on images from one visual field and tested on images from the other (translation invariance). This distinction suggests that, just as LOC encodes geometrically invariant object representations, higher areas in the ventral stream (specifically PPA) abstract away geometric variance in encoding natural scenes.
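The four train/test conditions described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the voxel patterns are synthetic toy data rather than the study's recordings, a nearest-centroid classifier stands in for the SVM so that the example needs only the standard library, and the even/odd trial split is a placeholder for whatever held-out scheme the authors used. The function names (`make_toy_patterns`, `cross_condition_accuracy`, and so on) are hypothetical.

```python
import random
from itertools import product

CONDITIONS = ("Left", "Right")


def make_toy_patterns(invariant, rng, n_categories=4, n_voxels=50,
                      n_trials=20, noise=0.3):
    """Build synthetic voxel patterns (NOT the study's data).

    invariant=True:  each category has one prototype shared by both sides.
    invariant=False: each (category, side) pair has its own prototype.
    Returns {condition: [(pattern, category), ...]}.
    """
    def new_prototypes():
        return [[rng.gauss(0, 1) for _ in range(n_voxels)]
                for _ in range(n_categories)]

    shared = new_prototypes()
    data = {}
    for cond in CONDITIONS:
        protos = shared if invariant else new_prototypes()
        trials = []
        for cat, proto in enumerate(protos):
            for _ in range(n_trials):
                pattern = [v + rng.gauss(0, noise) for v in proto]
                trials.append((pattern, cat))
        data[cond] = trials
    return data


def fit_centroids(trials):
    """Average the training patterns of each category."""
    sums, counts = {}, {}
    for pattern, cat in trials:
        if cat not in sums:
            sums[cat] = [0.0] * len(pattern)
            counts[cat] = 0
        sums[cat] = [s + v for s, v in zip(sums[cat], pattern)]
        counts[cat] += 1
    return {cat: [s / counts[cat] for s in sums[cat]] for cat in sums}


def predict(centroids, pattern):
    """Return the category whose centroid is closest to the pattern."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pattern))
    return min(centroids, key=lambda cat: sq_dist(centroids[cat]))


def cross_condition_accuracy(data):
    """Accuracy for every (train condition, test condition) pair.

    Assumption: even-indexed trials train and odd-indexed trials test,
    so same-side accuracy is also measured on held-out trials.
    """
    results = {}
    for train_cond, test_cond in product(CONDITIONS, repeat=2):
        centroids = fit_centroids(data[train_cond][0::2])
        test = data[test_cond][1::2]
        hits = sum(predict(centroids, p) == cat for p, cat in test)
        results[(train_cond, test_cond)] = hits / len(test)
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    for label, invariant in (("position-invariant toy region", True),
                             ("position-dependent toy region", False)):
        acc = cross_condition_accuracy(make_toy_patterns(invariant, rng))
        print(label)
        for (train_cond, test_cond), value in acc.items():
            print(f"  train {train_cond:5s} / test {test_cond:5s}: {value:.2f}")
```

In this toy setup, a region whose category patterns do not depend on stimulus position should classify about equally well in all four conditions, whereas a region whose patterns differ by side should lose accuracy when trained on one side and tested on the other. That is the contrast the abstract draws between PPA and the retinotopic areas; the sketch illustrates only the logic of the comparison, not the reported results.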