Abstract
As we look around an environment, we actively select which semantic information to attend to and which to ignore. How systematic are individual differences in this selection? In other words, does an individual's pattern of semantic attention in one environment reliably and uniquely predict their attention in a new environment? Here, we tested whether "attentional fingerprints" exist in naturalistic visual behavior. Participants' (n = 16) gaze was monitored while they actively explored real-world photospheres (n = 60) in virtual reality (VR). To model scene semantics, we introduced a novel approach that combines human judgments with computational language modeling to capture the affordance-based inferences available to first-person viewers. Specifically, we decomposed each photosphere into tiles and obtained from MTurk participants a written description of each tile containing both label ("a door") and affordance-based ("could be opened") content. Each description was transformed into a sentence-level semantic embedding using a context-sensitive NLP model (BERT). For each participant, we iteratively fit a mixed regression model (gaze ~ semantics) on n − 1 trials and used it to predict gaze in the left-out trial. We correlated each participant's predicted and actual gaze and tested whether the within-subject correlation, based on a participant's own semantic model, was higher on average than correlations based on all other participants' semantic models (own-other difference, OOD). We find that within-subject models accurately predict gaze on left-out photospheres (r = 0.33, p < 0.001); crucially, within-subject models are also individuating (OOD scores, p < 0.001). Interestingly, our ability to individuate gaze does not simply rely on modeling large numbers of semantic labels: OOD is greater when verbal descriptions contain both labels and affordances than when they contain labels alone (p = 0.039). Together, our results reveal "attentional fingerprints" in real-world visual behavior and highlight the potential for inferring individual differences in higher-order cognitive processes (action planning, inferential reasoning) and psychiatric traits (autism, anxiety) from gaze alone.
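To make the embedding step concrete, below is a minimal sketch of how a tile description could be mapped to a sentence-level BERT embedding. The abstract does not specify the BERT variant or the pooling scheme, so the choices here (bert-base-uncased, mean-pooling over token vectors) are assumptions, and the function name is hypothetical.

```python
# Sketch only: bert-base-uncased and mean-pooling are assumptions,
# not the authors' documented pipeline.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_description(description: str) -> torch.Tensor:
    """Map one tile description to a sentence-level semantic embedding."""
    inputs = tokenizer(description, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool the contextualized token embeddings into one 768-d vector.
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

vec = embed_description("a door that could be opened")
print(vec.shape)  # torch.Size([768])
```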
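Similarly, the leave-one-trial-out prediction and the own-other difference could be computed along the lines of the sketch below. The abstract names a mixed regression model; a plain linear regression stands in here for simplicity, and the data shapes (per-trial tile-by-feature semantic matrices shared across participants, per-trial tile-level gaze vectors) are our assumptions.

```python
# Simplified sketch of the leave-one-trial-out fit and the own-other
# difference (OOD); LinearRegression stands in for the mixed model.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

def loo_predictions(semantics, gaze):
    """semantics: list of (tiles x features) arrays, one per trial.
    gaze: list of (tiles,) arrays of one participant's gaze density.
    Returns predicted gaze for each trial from a model fit on the rest."""
    preds = []
    for held_out in range(len(gaze)):
        train_X = np.vstack([semantics[t] for t in range(len(gaze)) if t != held_out])
        train_y = np.concatenate([gaze[t] for t in range(len(gaze)) if t != held_out])
        model = LinearRegression().fit(train_X, train_y)  # gaze ~ semantics
        preds.append(model.predict(semantics[held_out]))
    return preds

def fit_score(preds, gaze):
    """Mean predicted-vs-actual correlation across trials."""
    return np.mean([pearsonr(p, g)[0] for p, g in zip(preds, gaze)])

def own_other_difference(all_preds, all_gaze):
    """Per participant: own-model fit minus mean fit of all other models."""
    n = len(all_gaze)
    oods = []
    for i in range(n):
        own = fit_score(all_preds[i], all_gaze[i])
        others = [fit_score(all_preds[j], all_gaze[i]) for j in range(n) if j != i]
        oods.append(own - np.mean(others))
    return np.array(oods)
```

Because the semantic features are shared across participants, each participant's model can generate predictions for every trial, which is what allows the own-model fit to be compared against fits from every other participant's model.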