September 2018, Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
Convolutional Network Approach to Modelling Allocentric Landmark Impact on Target Localization
Author Affiliations
  • Sohrab Salimian
    Center for Vision Research: Vision Science to Applications Program (VISTA); Department of Biology, York University, Toronto, ON, Canada
  • Richard Wildes
    Center for Vision Research: Vision Science to Applications Program (VISTA); Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada
  • John Crawford
    Center for Vision Research: Vision Science to Applications Program (VISTA); Departments of Psychology, Biology, Kinesiology and Health Science, York University, Toronto, ON, Canada
Journal of Vision September 2018, Vol.18, 206. doi:10.1167/18.10.206
© ARVO (1962-2015); The Authors (2016-present)
Abstract

A critical question in visual processing is the degree to which egocentric and allocentric reference frames are utilized during target localization. For example, Li et al. (2017) tested their contributions in macaque monkeys using a cue-conflict task, in which the monkeys were presented with a target and an allocentric landmark. The landmark was then masked and either shifted or not shifted. In the shift condition, the monkeys' final gaze position was significantly shifted towards the virtually shifted location of the target in allocentric coordinates. In the current work we attempted to model these results using a convolutional network (ConvNet) with a spatial transformer module. The model takes as input a binary image containing a target at a particular spatial location together with an allocentric landmark represented as the intersection of vertical and horizontal lines. It outputs a vector anchored at the (0,0) position of the image matrix, corresponding to the position on the array at which the target is calculated to lie. The network achieves this through multilayer processing that begins by estimating and applying an affine transformation accounting for differences between the target and landmark coordinates, followed by convolution and regression for target localization. The affine transformation is learned by the spatial transformer, which takes the image, applies the inverse of the transformation, and then feeds the output to the convolutional and regression layers (Jaderberg et al., 2015). The model's output agrees with the findings of Li et al. (2017): as the landmark is shifted away from the target, the network's estimate is also shifted away from the target position. Future work will aim to increase the robustness of target localization with respect to multiple allocentric landmarks and to modify the model's architecture to include hand-crafted components to increase precision.
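No code accompanies this abstract; as an illustrative sketch only, the following NumPy toy shows the grid-generation and sampling step that a spatial transformer performs (Jaderberg et al., 2015): the output image is filled by reading each output pixel from the location given by an affine map of its coordinates, so applying the inverse of a landmark shift relocates the target accordingly. The function name, the nearest-neighbour sampling, and the single-pixel toy scene are our own assumptions, not the model described above.

```python
import numpy as np

def affine_grid_sample(image, theta):
    """Sample `image` through a 2x3 affine map `theta`, in the style of a
    spatial transformer's grid generator + sampler (nearest-neighbour
    sampling, zero padding). Output pixel (y, x) is read from the input
    location theta @ [x, y, 1]."""
    H, W = image.shape
    out = np.zeros_like(image)
    for y in range(H):
        for x in range(W):
            src = theta @ np.array([x, y, 1.0])
            sx, sy = int(round(src[0])), int(round(src[1]))
            if 0 <= sx < W and 0 <= sy < H:  # zero padding outside bounds
                out[y, x] = image[sy, sx]
    return out

# Toy scene: a single bright "target" pixel on a blank binary array.
img = np.zeros((9, 9))
img[3, 4] = 1.0  # target at (x=4, y=3)

# A pure-translation shift of (+2, +1): the sampler is given the inverse
# map, so the target lands at the correspondingly shifted output position.
theta = np.array([[1.0, 0.0, -2.0],
                  [0.0, 1.0, -1.0]])
warped = affine_grid_sample(img, theta)
ty, tx = np.unravel_index(np.argmax(warped), warped.shape)
print(tx, ty)  # → 6 4
```

In the full model this warped representation would then pass to convolutional and regression layers for target localization; here the target position is simply read off with `argmax`.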

Meeting abstract presented at VSS 2018
