Abstract
Neural correlates of true, real-world vision are almost entirely unknown. This gap is particularly problematic in the context of face perception, where passively viewing static, unfamiliar, and isolated faces briefly presented at fixation bears little resemblance to the richness of real-world interpersonal interactions. In the real world, faces appear in context, are familiar, occupy relatively stable positions, and are actively sampled through volitional eye movements. To begin filling this gap in knowledge, we simultaneously recorded electrocorticography (ECoG), eye tracking, and video of the scenes being viewed by human subjects over hours of natural conversations with friends, family, and experimenters. Annotating these videos frame by frame using computer vision models, we labeled each fixation during natural behavior as landing on a face or on another object. Multivariate classification revealed that spatiotemporal patterns of neural activity in each subject discriminated whether the subject was looking at a face or an object. Notably, a far greater portion of cortex was involved in face processing during real-world vision than in a traditional experimental paradigm. Additionally, neural activity during object fixations could be used to classify whether or not a face was present elsewhere in the visual field, demonstrating contextual modulation of spatiotemporal patterns of neural activity. We then examined the neurodynamics of eye-movement guidance by showing that what subjects would look at next could be classified: brain activity predicted not only where in space subjects would saccade, but also whether or not the next saccade would land on a face. These findings demonstrate that the richness of real-world visual perception is captured in the underlying neurodynamics, highlighting the power of invasive neural recordings in humans, combined with real-world behavior, as a platform for studying visual neuroscience.
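As an illustration of the annotation step described above, the following is a minimal sketch of labeling each fixation as landing on a face or another object. The specific computer vision models used in the study are not stated here; OpenCV's Haar-cascade face detector and the input format of `fixations` are illustrative assumptions.

```python
# Sketch: label fixations as "face" or "object" by testing whether the gaze
# point falls inside a detected face bounding box on the fixated video frame.
# The detector choice and the (frame_index, gaze_x, gaze_y) fixation format
# are assumptions for illustration, not the study's actual pipeline.
import cv2

def label_fixations(video_path, fixations):
    """Return a 'face'/'object' label for each (frame_index, gaze_x, gaze_y)."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    labels = []
    for frame_idx, gx, gy in fixations:
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
        ok, frame = cap.read()
        if not ok:
            labels.append(None)  # frame unavailable
            continue
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        # The fixation is "on a face" if the gaze point lies in any face box.
        on_face = any(x <= gx <= x + w and y <= gy <= y + h
                      for (x, y, w, h) in faces)
        labels.append("face" if on_face else "object")
    cap.release()
    return labels
```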
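The multivariate classification analyses (face vs. object fixations, contextual face presence, and next-saccade target) all amount to decoding labels from fixation-locked spatiotemporal neural patterns. Below is a minimal sketch of that approach, assuming epochs arranged as channels x time per fixation and a linear classifier; the feature layout, classifier, and placeholder data are assumptions, not the study's reported methods.

```python
# Sketch: decode face vs. object fixations from spatiotemporal patterns of
# fixation-locked neural activity using a cross-validated linear classifier.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_fixations, n_channels, n_times = 400, 64, 50

# Placeholder data standing in for fixation-locked ECoG epochs and the
# face/object labels produced by the video annotation step.
X = rng.standard_normal((n_fixations, n_channels, n_times))
y = rng.integers(0, 2, n_fixations)  # 1 = face fixation, 0 = object fixation

# Flatten each epoch into a single spatiotemporal feature vector.
X_flat = X.reshape(n_fixations, -1)

clf = make_pipeline(StandardScaler(), LinearSVC(C=0.01, dual=False))
scores = cross_val_score(clf, X_flat, y, cv=5)
print(f"Cross-validated decoding accuracy: {scores.mean():.2f}")
```

The same machinery applies to the other decoding questions by swapping the label vector, e.g. whether a face was present elsewhere in the visual field during object fixations, or whether the upcoming saccade targets a face.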