Abstract
Eye movements and visual attention are guided by expectations about semantic informativeness, such as relevance to the scene category and to the constituent objects in the scene. These expectations are generated from implicit ‘scene grammar’ rules about which objects are likely to appear in a given scene and where. Given the pivotal role of semantic expectations, it is unsurprising that unexpected elements such as semantic violations have been found to be more difficult to recognize and to require more cognitive effort to process at fixation. Yet little is known about the effects of semantic violations on subsequent viewing behavior. Here, we explored whether encountering a semantically inconsistent object has persistent effects on oculomotor programming and semantic guidance during the first three fixations after exiting the object. Eye-tracking data were collected from 102 participants viewing 62 scenes from the SCEGRAM image database (Öhlschläger & Võ, 2017), each featuring either a consistent or an inconsistent critical object. We replicated previous findings, showing reliably elevated fixation rates and dwell times on inconsistent objects compared to consistent objects. However, post-exit fixation durations and saccade amplitudes did not differ significantly between conditions. Furthermore, inconsistent object semantics did not significantly influence subsequent attentional guidance: encountering an inconsistent object neither pushed post-exit visual attention towards regions more conceptually similar to the object nor interfered with guidance from other sources of semantic scene information. Overall, the results showed that disruptions from semantic violations did not extend beyond the borders of the object and that inconsistent semantics were ignored or suppressed when deciding where to attend next.
The current work is the first to study attention following fixation on a semantic violation and advances our understanding of how the visual system handles and adapts to unexpected elements in realistic visual environments.