September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2018
Convolutional recurrent neural network models of dynamics in higher visual cortex
Author Affiliations
  • Aran Nayebi
    Stanford University
  • Jonas Kubilius
    Massachusetts Institute of Technology
  • Daniel Bear
    Stanford University
  • Surya Ganguli
    Stanford University
  • James DiCarlo
    Massachusetts Institute of Technology
  • Daniel Yamins
    Stanford University
Journal of Vision September 2018, Vol.18, 717. doi:
      © ARVO (1962-2015); The Authors (2016-present)


Neurons in the ventral visual pathway exhibit behaviorally relevant temporal dynamics during image viewing. However, the most accurate existing computational models of this system are feedforward hierarchical convolutional neural networks (HCNNs), which capture neurons' time-averaged responses, but do not account well for their complex temporal trajectories. Here we show that HCNNs augmented with both local and global recurrent connections are quantitatively accurate models of dynamics in higher visual cortex. We began with a five-layer HCNN that achieved state-of-the-art predictions of temporally averaged visual responses in macaque V4 and IT neurons. To model within-area dynamics, we replaced units in each layer with one of several local recurrent circuit motifs, including simple Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) units. We also included combinations of global feedback connections, in which outputs of later convolutional layers were added to inputs of earlier layers. Using backpropagation through time, these new parameters were optimized to predict V4 and IT neural response patterns. Finally, we tested these networks' ability to predict responses on held-out images and neurons not used for model optimization. We found that the best network structure led to substantial improvements over the feedforward baseline, explaining close to 100% of the explainable variance in V4 neurons and above 75% in IT neurons on average across time points. This network made use of gated local recurrence, with LSTMs and GRUs proving superior to simple RNNs. Furthermore, the presence of specific global feedback connections in this network was critical for best predicting V4 neuron dynamics.
In summary, we have developed a deep recurrent neural network architecture that accurately captures temporal dynamics in several ventral cortical areas, opening the door to more detailed computational study of the circuit structures underlying complex visual behaviors.
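
The two ingredients named in the abstract, gated local recurrence within a layer and global feedback from a later layer to an earlier one, can be illustrated with a small sketch. This is not the authors' implementation: the two-layer depth, the convolutional GRU form, the kernel sizes, the channel counts, the 1x1 feedback projection (`w_fb`), and the random untrained weights are all assumptions chosen for brevity, and training by backpropagation through time is omitted. The sketch only shows how a static image, presented for several time steps, can yield a time-varying response trajectory.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_same(x, w):
    """Zero-padded 'same' cross-correlation.
    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W)."""
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    windows = sliding_window_view(xp, (k, k), axis=(1, 2))  # (C_in, H, W, k, k)
    return np.einsum("chwij,ocij->ohw", windows, w)


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


class ConvGRUCell:
    """Hypothetical convolutional GRU: gates are computed by convolutions
    over the concatenated input and hidden state (local recurrence)."""

    def __init__(self, in_ch, hid_ch, k=3, scale=0.3, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        shape = (hid_ch, in_ch + hid_ch, k, k)
        self.w_z = rng.normal(0.0, scale, shape)  # update gate weights
        self.w_r = rng.normal(0.0, scale, shape)  # reset gate weights
        self.w_h = rng.normal(0.0, scale, shape)  # candidate state weights
        self.hid_ch = hid_ch

    def init_state(self, height, width):
        return np.zeros((self.hid_ch, height, width))

    def step(self, x, h):
        xh = np.concatenate([x, h], axis=0)
        z = sigmoid(conv2d_same(xh, self.w_z))
        r = sigmoid(conv2d_same(xh, self.w_r))
        cand = np.tanh(conv2d_same(np.concatenate([x, r * h], axis=0), self.w_h))
        return (1.0 - z) * h + z * cand


def run_network(frames, cell1, cell2, w_fb):
    """Unroll a two-layer recurrent network over time.
    frames: (T, C, H, W). w_fb: (C, hid2, 1, 1) projects the layer-2 state
    from the previous step so it can be added to the layer-1 input
    (global feedback). Returns the layer-2 states, shape (T, hid2, H, W)."""
    T, _, H, W = frames.shape
    h1 = cell1.init_state(H, W)
    h2 = cell2.init_state(H, W)
    outputs = []
    for t in range(T):
        feedback = conv2d_same(h2, w_fb)          # uses h2 from step t-1
        h1 = cell1.step(frames[t] + feedback, h1)
        h2 = cell2.step(h1, h2)
        outputs.append(h2)
    return np.stack(outputs)


rng = np.random.default_rng(1)
image = rng.normal(size=(3, 8, 8))                # one static "image"
frames = np.repeat(image[None], 5, axis=0)        # shown for 5 time steps
cell1 = ConvGRUCell(in_ch=3, hid_ch=6, rng=rng)
cell2 = ConvGRUCell(in_ch=6, hid_ch=4, rng=rng)
w_fb = rng.normal(0.0, 0.3, size=(3, 4, 1, 1))

out = run_network(frames, cell1, cell2, w_fb)
out_no_fb = run_network(frames, cell1, cell2, np.zeros_like(w_fb))
print(out.shape)
```

Because the hidden states start at zero, the feedback term is zero at the first time step, so the runs with and without feedback agree there and diverge afterwards. In a fitted model the weights would instead be optimized against recorded responses, and the per-time-step outputs would be compared with neural data.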

Meeting abstract presented at VSS 2018

