Abstract
Rapid visual presentation paradigms, characterized by short presentation times and speeded behavioral responses, have played a central role in the study of our core visual recognition abilities. Here, we adapted the rapid presentation paradigm to a visual navigation task to study how pre-attentive vision controls behavior. Method. We used CryEngine, a state-of-the-art gaming engine, to synthesize videos simulating self-motion through naturalistic scenes from a first-person view at human walking speed (~1.4 m/s). Modern gaming engines make it possible to create relatively controlled, yet realistic, visual environments. Participants were presented with these 300 ms masked video sequences and used the mouse to report the steering direction they judged best for navigating through the scene while avoiding obstacles. The steering direction was selected from a continuum indicated by arrows along a circle projected on the ground. Participants were instructed to respond as quickly and accurately as possible so as to limit cortical feedback. Results. We computed a navigability index for each video as the proportion of responses from all participants falling in each steering direction (18 bins of 10 deg azimuth). Overall, we found a high degree of consistency across participants, suggesting that they understood the task and relied on similar scene information. Behavioral data were consistent with the steering directions predicted by a model of the behavioral dynamics of steering (Fajen & Warren, 2003) applied to ground-truth depth data. We further used the behavioral responses to evaluate a variety of visual cues, from motion and shape to saliency, and to learn to predict participants' steering directions. Overall, the relative success of the proposed approach suggests that it may be possible to learn visual strategies (e.g., visual equalization, saccade and clutter responses) directly from data without implementing them explicitly.
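A minimal sketch of the navigability index computation, assuming the reported azimuths span the frontal hemifield (the [-90, +90) deg range and the function name are our assumptions; the abstract specifies only 18 bins of 10 deg):

    import numpy as np

    def navigability_index(responses_deg, n_bins=18):
        # Histogram of all participants' reported steering azimuths for one
        # video, normalized so the bins sum to 1 (proportion of responses
        # per steering direction). The [-90, 90) deg range is assumed,
        # not stated in the abstract.
        counts, _ = np.histogram(responses_deg, bins=n_bins,
                                 range=(-90.0, 90.0))
        return counts / counts.sum()

    # Example: pooled responses (deg azimuth) for a single video
    index = navigability_index(np.array([-12.0, 3.0, 5.0, 8.0, 41.0]))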
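The model-based predictions can be sketched with the Fajen & Warren (2003) steering dynamics, in which the heading angle phi evolves as a damped second-order system attracted toward the goal direction and repelled by obstacles. The parameter values below are illustrative defaults in the range reported in the literature, not the ones used in this study, and the obstacle directions and distances would here come from the ground-truth depth data:

    import numpy as np

    def steering_accel(phi, phi_dot, psi_g, d_g, obstacles,
                       b=3.25, k_g=7.5, c1=0.4, c2=0.4,
                       k_o=198.0, c3=6.5, c4=0.8):
        # Angular acceleration of heading (rad/s^2): damping on the
        # turning rate, attraction toward the goal direction psi_g
        # scaled by goal distance d_g, plus a repulsion term for each
        # obstacle at direction psi_o and distance d_o.
        goal = -k_g * (phi - psi_g) * (np.exp(-c1 * d_g) + c2)
        avoid = sum(k_o * (phi - psi_o)
                    * np.exp(-c3 * abs(phi - psi_o)) * np.exp(-c4 * d_o)
                    for psi_o, d_o in obstacles)
        return -b * phi_dot + goal + avoid

Integrating this equation forward (e.g., with a small Euler step) from the camera's initial heading yields a predicted steering direction that can be binned in the same way as the behavioral responses for comparison.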
Meeting abstract presented at VSS 2014