Abstract
Previously acquired semantic and syntactic knowledge about scenes, so-called scene grammar, guides visual search and supports incidental encoding of objects. However, it is still unclear how scene grammar shapes our interactions with the environment and influences the resulting representations during natural behavior. To investigate this question, participants performed a repeated visual search task in a 3D virtual environment. They successively searched for ten out of 20 possible target objects in ten realistic scenes. Crucially, half of the scenes were inverted to impede access to scene grammar guidance, and upright and inverted scenes were randomly interleaved. After searching, participants completed a surprise old/new object recognition task that assessed incidental object memory. First results show that although participants searched descriptively longer in inverted scenes, learning did not differ between conditions. Using eye-tracking, we found that search initiation time was unaffected by scene inversion, but time to first target fixation and decision time were longer during search in inverted scenes. Importantly, time to first target fixation was affected by incidental gaze duration on the target object, whereas decision time was not. In the subsequent recognition task, we replicated previous findings from 2D studies showing that target objects are remembered substantially better than distractors. We found no main effect of search condition, but we did find an interaction: targets searched for in upright scenes were remembered better than targets searched for in inverted scenes, whereas distractors were remembered better when they appeared in an inverted rather than an upright scene. Moreover, decision time and incidental gaze duration on objects during search, but not search time, predicted memory performance in both conditions.
Our findings demonstrate that, during natural behavior, scene grammar interacts with task relevance to guide search, but that this results in a trade-off by affecting incidentally emerging object memories.