Abstract
Human spatiotemporal reasoning and problem-solving rest on the effortless encoding of perceptual and linguistic cues. The integration of cues across visual and linguistic domains is relatively understudied, and it is particularly challenging because the two sources can differ markedly in interpretation and presumed reliability. When making decisions in real time, how do we combine cues coming from linguistic and visual sources? Which cue is more important, and how do we resolve potential conflicts? In a first study, we asked participants to navigate through eight virtual reality mazes using the Oculus Rift headset. Each maze comprised 30 T-intersections, each presenting a binary choice (go left or right). As participants approached each intersection, they were presented with either a visual cue (a red arrow) or an auditory cue (a voice saying "Go right" or "Go left"). Four mazes displayed only visual cues with varying levels of reliability (10%, 30%, 50%, or 70%, thus including cases where cues were "reversed"), while the other four mazes presented only auditory cues at analogous levels of reliability. We recorded the proportion of trials on which cues were "trusted" (i.e., participants followed the indicated direction) under the different conditions. Results show a higher level of trust for voice cues than for arrow cues, and a marked drop in trust at 10% reliability, while the other reliability levels cluster together. A second study used a similar setup, except that both visual and auditory cues were presented, with either matching or differing reliability levels (20%, 50%, or 80%). Again, participants tended to trust linguistic cues more than visual ones, even when reliability levels were objectively matched. We also found a number of more subtle interactions between cue type and reliability learning, suggesting a complex integrative process underlying real-time decision-making.
Meeting abstract presented at VSS 2017