October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract  |   October 2020
Comparison of a reinforcement-learning and a biologically-motivated representation of 3D space
Author Affiliations & Notes
  • Andrew Glennerster
    University of Reading
  • Alexander Muryy
    University of Reading
  • Footnotes
    Acknowledgements  Funded by EPSRC/Dstl EP/N019423/1
Journal of Vision October 2020, Vol.20, 384. doi:https://doi.org/10.1167/jov.20.11.384
Recent advances in reinforcement learning demonstrate that navigation and prediction of novel views do not require the agent to have a 3D model of the scene. Here, we examine a reinforcement-learning method that rewards an agent for arriving at a target image but does not generate a 3D 'map'. We compare this to a biologically motivated alternative that also avoids 3D reconstruction: a hand-crafted representation based on relative visual directions (RVD) which has, by design, a high degree of geometric consistency. We tested the ability of both types of representation to support geometric tasks such as interpolating between learned locations. In both cases, interpolation is possible if two stored feature vectors in the network, each associated with a given location, are averaged and the mean vector is decoded to recover a mean location. Performance is much more variable for the reinforcement-learning model than for the RVD model (about seven times greater standard deviation). We show the same result for interpolation of camera orientation. A t-SNE projection of the stored vectors into 2D for each type of representation illustrates why the performance of the two models differs on these tasks. In the RVD model, the t-SNE projection shows a regular pattern reflecting the geometric layout of the learned locations in space, whereas for the reinforcement-learning model the clustering of stored vectors reflects other factors, such as the agent's goals during training. Our comparison of these two models demonstrates that it is advantageous to include information about the persistence of features as the camera translates (e.g. distant features persist). It is likely that representations of this sort, storing high-dimensional state vectors instead of 3D coordinates, will be increasingly important in the search for robust models of human spatial perception and navigation.
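The interpolation test described in the abstract (average two stored feature vectors, decode the mean vector, compare with the midpoint of the two locations) can be sketched in a few lines. Everything below is an illustrative assumption, not the authors' actual models: a toy linear encoder stands in for the learned representation, and a least-squares fit stands in for the decoder. With a linear encoder the decoded mean lands exactly on the true midpoint; the abstract's point is that real learned representations deviate from this ideal by very different amounts.

```python
# Hypothetical sketch of location interpolation via stored-vector averaging.
# The encoder, decoder, and all names here are illustrative assumptions;
# they are not the reinforcement-learning or RVD models from the abstract.
import numpy as np

rng = np.random.default_rng(0)

D = 64                               # dimensionality of stored state vectors
W = rng.normal(size=(D, 2))          # toy linear encoder: location -> vector

def encode(loc):
    """Map a 2D location to its stored high-dimensional feature vector."""
    return W @ loc

# Fit a linear decoder (vector -> location) by least squares over a set
# of learned locations and their stored vectors.
locs = rng.uniform(-1.0, 1.0, size=(100, 2))
vecs = np.array([encode(l) for l in locs])
decoder = np.linalg.lstsq(vecs, locs, rcond=None)[0]   # shape (D, 2)

def decode(vec):
    """Recover a 2D location from a stored (or averaged) feature vector."""
    return vec @ decoder

# Interpolation test: average two stored vectors, decode the mean vector,
# and compare with the true midpoint of the two locations.
a, b = locs[0], locs[1]
mean_vec = 0.5 * (encode(a) + encode(b))
predicted = decode(mean_vec)
true_mid = 0.5 * (a + b)
err = np.linalg.norm(predicted - true_mid)
print(err)   # near zero for this linear toy; learned encoders deviate
```

In the abstract's comparison, the interesting quantity is the spread of this error across many location pairs: about seven times greater standard deviation for the reinforcement-learning representation than for the RVD representation.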

