Michael Bonner, Russell Epstein; Computational mechanisms for identifying the navigational affordances of scenes in a deep convolutional neural network. Journal of Vision 2017;17(10):298. doi: https://doi.org/10.1167/17.10.298.
© ARVO (1962-2015); The Authors (2016-present)
A central component of spatial navigation is determining where one can and cannot go in the immediate environment. For example, in indoor environments, walls limit one's potential routes, while passageways facilitate movement. In a recent set of fMRI experiments, we found evidence suggesting that the human visual system solves this problem by automatically identifying the navigational affordances of the local scene. Specifically, we found that the occipital place area (OPA), a scene-selective region near the transverse occipital sulcus, appears to automatically encode the navigational layout of visual scenes, even when subjects are not engaged in a navigational task. Given the apparent automaticity of this process, we predicted that affordance identification could be rapidly achieved through a series of purely feedforward computations performed on retinal inputs. To test this prediction and to explore other computational properties of affordance identification, we examined the representational content in a deep convolutional neural network (CNN) that was trained on the Places database for scene categorization but has also been shown to contain information relating to the coarse spatial layout of scenes. Using representational similarity analysis (RSA), we found that the CNN contained information relating to both the neural responses of the OPA and the navigational affordances of scenes, most prominently in the mid-level layers of the CNN. We then performed a series of analyses to isolate the visual inputs that are critical for identifying navigational affordances in the CNN. These analyses revealed a strong reliance on visual features at high spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex.
Together, these findings demonstrate the feasibility of computing navigational affordances in a feedforward sweep through a hierarchical system, and they highlight the specific visual inputs on which these computations rely.
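As a rough illustration of the RSA comparison described above, the following minimal sketch builds a representational dissimilarity matrix (RDM) from a set of per-scene response patterns and correlates two RDMs over their off-diagonal entries. The abstract does not state which dissimilarity measure or comparison statistic was used, so the correlation distance and the Spearman rank correlation here are assumptions, as are all function names and the synthetic data standing in for CNN layer activations.

```python
import numpy as np


def rdm(patterns):
    """Return a scenes x scenes dissimilarity matrix (1 - Pearson r between rows).

    `patterns` has one row per scene, e.g. the flattened activations of one
    CNN layer. Correlation distance is an assumption, not the abstract's
    stated measure.
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Extract the entries above the diagonal as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def average_ranks(values):
    """Rank values from 0, assigning tied values the mean of their ranks."""
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def rsa_score(rdm_a, rdm_b):
    """Spearman correlation between the off-diagonal entries of two RDMs."""
    ranks_a = average_ranks(upper_triangle(rdm_a))
    ranks_b = average_ranks(upper_triangle(rdm_b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


# Synthetic stand-ins (hypothetical): 20 scenes, 50 features each.
rng = np.random.default_rng(0)
layer_patterns = rng.normal(size=(20, 50))
# A noisy copy mimics a second measurement of the same representation.
related_patterns = layer_patterns + 0.1 * rng.normal(size=(20, 50))
# Independent noise mimics an unrelated representation.
unrelated_patterns = rng.normal(size=(20, 50))

score_related = rsa_score(rdm(layer_patterns), rdm(related_patterns))
score_unrelated = rsa_score(rdm(layer_patterns), rdm(unrelated_patterns))
```

In an analysis of the kind the abstract describes, one RDM would presumably come from each CNN layer and the others from OPA response patterns and from a model of navigational affordances, with the score computed layer by layer; those inputs are not reproduced here.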
Meeting abstract presented at VSS 2017