Abstract
Returning to a previously visited location (‘home’) could be achieved either by image matching or by 3D reconstruction of the scene. We have previously shown that participants’ errors are better predicted by image matching, but here we restricted participants’ views to prevent them from using this strategy.
In the learning phase, participants in immersive virtual reality viewed a naturalistic indoor scene from one zone (binocular vision and limited head movements permitted) with a restricted field of view (a 90-degree cone) and only one permitted viewing direction (e.g. North). Once participants were familiar with the view, the cyclopean point was briefly frozen with respect to the scene; this location defined ‘home’. Participants were then teleported to another location and had to return to ‘home’ (search phase). The field of view was again restricted, but the viewing direction could differ from that of the learning phase by 0, 90 or 180 degrees. The learning-phase view was always towards the centre of the room, and participants had a sufficient view of objects in both the learning and search phases to ensure that the task was always possible.
Participants’ errors (RMSE of the reported location relative to ‘home’) increased as a function of the angle between the learning- and search-phase viewing directions. When the search-phase orientation differed by 90 or 180 degrees, the reported location was systematically shifted in the direction of the search-phase view (GROUP: p < .0001).
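For reference, the error measure is a standard root-mean-square error of reported position; a minimal sketch of the definition, assuming averaging over the N responses within a condition (the exact averaging is not stated above), is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert \mathbf{x}_i - \mathbf{x}_{\mathrm{home}} \right\rVert^{2}},$$

where x_i is the reported location on response i and x_home is the true ‘home’ location.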
Because the task was designed to prevent image matching, the fact that participants were nevertheless able to return relatively close to ‘home’ rules out the hypothesis that they solved the task with an image-matching strategy. On the other hand, a 3D reconstruction hypothesis does not predict these systematic biases. Any image-based strategy that could explain these data would need to rely on something like the latent-space interpolation that has proved so successful in generative adversarial networks (GANs).