Vision Sciences Society Annual Meeting Abstract  |  December 2022
Volume 22, Issue 14  |  Open Access
Properties of V1 and MT motion tuning emerge from unsupervised predictive learning
Author Affiliations & Notes
  • Katherine Storrs
    Department of Experimental Psychology, Justus Liebig University Giessen, Germany
  • Onno Kampman
    Department of Psychology, University of Cambridge, UK
  • Reuben Rideaux
    Queensland Brain Institute, University of Queensland, Australia
  • Guido Maiello
    Department of Experimental Psychology, Justus Liebig University Giessen, Germany
  • Roland Fleming
    Department of Experimental Psychology, Justus Liebig University Giessen, Germany
  • Footnotes
    Acknowledgements  Supported by the HMWK cluster project “The Adaptive Mind”; the DFG (SFB-TRR-135 #222641018); the ERC (ERC-2015-CoG-682859: “SHAPE”); Marie Skłodowska-Curie Actions (H2020-MSCA-ITN-2017: #765121 and H2020-MSCA-IF-2017: #793660); the ARC (DE210100790); and an Alexander von Humboldt fellowship.
Journal of Vision, December 2022, Vol. 22, 4415. https://doi.org/10.1167/jov.22.14.4415
© ARVO (1962-2015); The Authors (2016-present)
Abstract

Our ability to perceive motion arises from a hierarchy of motion-tuned cells in visual cortices. Signatures of V1 and MT motion tuning emerge in artificial neural networks trained to report the speed and direction of sliding images (Rideaux & Welchman, 2020). However, the brain’s motion code must develop without access to such ground-truth information. Here we tested whether a more realistic learning objective, unsupervised learning by predicting future observations, also yields motion processing that resembles physiology. We trained a two-layer recurrent convolutional network based on predictive-coding principles (PredNet; Lotter, Kreiman & Cox, 2016) to predict the next frame in videos. Training stimuli were 64,000 six-frame videos depicting natural image fragments sliding with uniformly sampled random velocity and direction. The network’s learning objective was to minimise the mean absolute pixel error between its prediction and the actual next frame. Despite the network receiving no explicit information about direction or velocity, almost all units in both layers developed tuning to a specific motion direction and velocity when probed with drifting sinusoidal gratings. The network also recapitulated population-level properties of motion tuning in V1. In both layers, mean activation across the population of units showed a motion-direction anisotropy, peaking at 90 and 270 degrees (vertical motion), likely due to the static orientation statistics of natural images. Like MT neurons, units in the network appeared to solve the “aperture problem”: when probed with pairs of orthogonally drifting gratings superimposed to create plaid patterns, almost all units were tuned to the direction of the whole pattern rather than to its individual components. Unsupervised predictive learning thus creates neural-like single-unit tuning, population tuning statistics, and integration of locally ambiguous motion signals, and provides an interrogable model of why motion computations take the form they do.
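As a rough illustration of the training setup described above, the sketch below generates one six-frame video of a natural-image fragment sliding at a uniformly sampled velocity and direction, together with the pixel-wise L1 objective. Crop size, speed range, and all function names are our own assumptions for illustration, not taken from the study.

```python
import numpy as np

def sliding_video(image, n_frames=6, crop=64, max_speed=4.0, rng=None):
    """One training video: a natural-image fragment translating at a random,
    uniformly sampled speed and direction. Crop size and speed range are
    illustrative assumptions; `image` is assumed to be at least
    crop + 2 * n_frames * max_speed pixels on each side."""
    rng = np.random.default_rng() if rng is None else rng
    speed = rng.uniform(0.0, max_speed)           # pixels per frame
    theta = rng.uniform(0.0, 2.0 * np.pi)         # drift direction
    dx, dy = speed * np.cos(theta), speed * np.sin(theta)
    h, w = image.shape
    margin = int(np.ceil(n_frames * max_speed))   # room to slide either way
    x0 = rng.integers(margin, w - crop - margin)
    y0 = rng.integers(margin, h - crop - margin)
    frames = [image[int(y0 + t * dy): int(y0 + t * dy) + crop,
                    int(x0 + t * dx): int(x0 + t * dx) + crop]
              for t in range(n_frames)]
    return np.stack(frames)                       # (n_frames, crop, crop)

def next_frame_l1(prediction, target):
    """The network's objective: mean absolute pixel error between the
    predicted and the actual next frame."""
    return np.abs(prediction - target).mean()
```

In the study, this error was minimised over 64,000 such videos by PredNet itself; the sketch only shows the stimulus statistics and loss, not the recurrent network.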

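The probe analyses lend themselves to a similar sketch. Below, drifting sinusoidal gratings and their plaid superposition are generated, and a unit's preferred direction is read off by sweeping the drift direction. Spatial frequency, speed, and the 15-degree sampling step are assumed values, and `unit_response` is a hypothetical stand-in for the mean activation of one trained network unit.

```python
import numpy as np

def drifting_grating(direction_deg, n_frames=6, size=64,
                     spatial_freq=0.1, speed=2.0):
    """Six-frame drifting sinusoidal grating. Frame size, spatial frequency
    (cycles/pixel), and speed (pixels/frame) are assumed values."""
    theta = np.deg2rad(direction_deg)
    y, x = np.mgrid[0:size, 0:size].astype(float)
    pos = x * np.cos(theta) + y * np.sin(theta)   # position along drift axis
    frames = [np.sin(2 * np.pi * spatial_freq * (pos - speed * t))
              for t in range(n_frames)]
    return np.stack(frames)                       # (n_frames, size, size)

def drifting_plaid(pattern_direction_deg, **kwargs):
    """Plaid from two orthogonally drifting gratings. By intersection of
    constraints, the whole pattern drifts in pattern_direction_deg while
    the components drift at +/-45 degrees from it."""
    return 0.5 * (drifting_grating(pattern_direction_deg - 45, **kwargs)
                  + drifting_grating(pattern_direction_deg + 45, **kwargs))

def preferred_direction(unit_response, step=15):
    """Sweep drift direction in `step`-degree increments and return the
    direction that maximally activates the unit. `unit_response` is a
    hypothetical callable mapping a video (frames, H, W) to a scalar,
    e.g. one network unit's mean activation over the video."""
    directions = np.arange(0, 360, step)
    responses = [unit_response(drifting_grating(d)) for d in directions]
    return directions[int(np.argmax(responses))]
```

Probed with `drifting_plaid` instead of single gratings, a pattern-selective unit would peak at the plaid's pattern direction, while a component-selective unit would show two peaks offset by 45 degrees on either side; this is the diagnostic the abstract uses to identify MT-like pattern tuning.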