September 2016
Volume 16, Issue 12
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2016
A recurrent convolutional neural network model for visual feature integration in memory and across saccades
Author Affiliations
  • Yalda Mohsenzadeh
    Centre for Vision Research, York University, Toronto, Ontario, Canada
  • J. Douglas Crawford
    Centre for Vision Research, York University, Toronto, Ontario, Canada
Journal of Vision September 2016, Vol.16, 101. doi:https://doi.org/10.1167/16.12.101
Abstract

The stability of visual perception, despite the retinal image shifting with every saccade, raises the question of how the brain combines temporally and spatially separated visual inputs into a unified, continuous representation of the world through time. The brain could solve this challenge by retaining, updating, and integrating visual feature information across saccades. However, no single model currently accounts for this process at the computational and/or algorithmic level. Previously, feedforward convolutional neural network (CNN) models, inspired by the hierarchical structure of visual processing in the ventral stream, have shown promising performance in object recognition (Bengio 2013). Here, we present a recurrent CNN to model the spatiotemporal mechanism of feature integration across saccades. Our network has five layers: an input layer that receives a sequence of gaze-centered images; a recurrent layer of neurons with V1-like receptive fields that serves as a feature memory; a pooling layer over the feature maps that reduces the spatial dependence of the feature information (similar to higher levels of the ventral stream); a convolutional map layer; and a fully connected output layer that performs a categorization task. The network is trained on a memory feature-integration task: categorizing feature information integrated from inputs collected at different time points. Once trained, the model showed how feature representations are retained in the feature-memory layer during a memory period and integrated with newly arriving features. The next step is to incorporate internal eye-movement information (intended eye displacement, eye velocity, and eye position) into the model to examine how intended eye movements update the feature maps. Our preliminary results suggest that recurrent CNNs provide a promising model of human visual feature integration and may explain the spatiotemporal aspects of this phenomenon across both fixations and saccades.
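To make the five-layer architecture concrete, the following is a minimal illustrative sketch in PyTorch, not the authors' implementation: the filter sizes, channel counts, image size, number of categories, and the particular convolutional-recurrent update rule are all assumptions, since the abstract does not specify them.

# Hypothetical sketch of the described architecture: input -> recurrent
# feature-memory layer -> pooling -> convolutional map layer -> fully
# connected categorization output. All hyperparameters are assumed.
import torch
import torch.nn as nn


class ConvRNNCell(nn.Module):
    """Recurrent convolutional layer acting as a feature memory:
    combines the current gaze-centered image with the previous state."""
    def __init__(self, in_ch, hidden_ch, k=5):
        super().__init__()
        self.input_conv = nn.Conv2d(in_ch, hidden_ch, k, padding=k // 2)
        self.recur_conv = nn.Conv2d(hidden_ch, hidden_ch, k, padding=k // 2)

    def forward(self, x, h):
        # Retain and update V1-like feature maps across time steps.
        return torch.relu(self.input_conv(x) + self.recur_conv(h))


class RecurrentCNN(nn.Module):
    def __init__(self, n_classes=10, hidden_ch=16, map_ch=32, img_size=32):
        super().__init__()
        self.hidden_ch = hidden_ch
        self.img_size = img_size
        self.feature_memory = ConvRNNCell(1, hidden_ch)             # layer 2
        self.pool = nn.MaxPool2d(2)                                 # layer 3
        self.conv_map = nn.Conv2d(hidden_ch, map_ch, 3, padding=1)  # layer 4
        self.readout = nn.Linear(map_ch * (img_size // 2) ** 2, n_classes)  # layer 5

    def forward(self, frames):
        # frames: (batch, time, 1, H, W) sequence of gaze-centered images.
        b = frames.size(0)
        h = torch.zeros(b, self.hidden_ch, self.img_size, self.img_size,
                        device=frames.device)
        for t in range(frames.size(1)):
            h = self.feature_memory(frames[:, t], h)  # integrate across fixations
        z = torch.relu(self.conv_map(self.pool(h)))
        return self.readout(z.flatten(1))             # categorization output


# Toy usage: 4 sequences of 3 fixations, each a 32x32 grayscale image.
model = RecurrentCNN()
logits = model(torch.randn(4, 3, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 10])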

Meeting abstract presented at VSS 2016
