Abstract
Introduction. In 1944, Heider and Simmel showed that humans spontaneously generate social interpretations when viewing sparse animations of moving shapes. Subsequent studies have investigated how motion trajectories are recognized as human actions (Roemmele et al. 2016), how cuing can elicit social meaning in simple animations (Tavares et al. 2008), and how motion within a context can drive attributions of beliefs (Baker et al. 2017). Here, we combine and compare the attributions of actions, intentions, emotions, and beliefs elicited by such animations, an intersection that has been understudied. Specifically, we map human descriptions of Heider-Simmel-like animations into a semantic space, where we can examine the representational structures that underlie the perceptual and cognitive processes involved.

Methods. Each participant viewed two subsets of 100 animations, labeling the gist of each animation with a single keyword in the first phase and choosing from a predefined list of labels in the second. The list was drawn from previous literature and was broadly categorized into action, intention, and belief. The human labels for each animation were then embedded into a semantic vector using Google's Universal Sentence Encoder (USE) language model. We constructed three candidate models capturing emotion, interactivity, and mental-state attribution, and correlated each with the semantic similarity structure of the animations (sketched below).

Results. A network frequency analysis showed that participants most saliently identified emotional narratives in the sequences, with negative-emotion and avoidant-action labels emerging as hubs. The semantic structure of the animations, as described by participants, correlated most strongly with the emotion model, followed by the interactivity model, and only weakly with the mentalistic model.

Conclusion. Humans are especially sensitive to emotional attributes in animations of moving shapes, relative to action- and belief-based attributes.
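To make the embedding-and-comparison pipeline concrete, the sketch below shows one plausible implementation. It is illustrative only: the USE module URL, cosine similarity as the semantic distance, Spearman correlation as the comparison statistic, and all labels and model matrices are assumptions for demonstration, not details reported in the abstract.

```python
# Minimal sketch: embed animation labels with USE, then correlate the
# resulting semantic similarity structure with a candidate model matrix.
import numpy as np
import tensorflow_hub as hub
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

# Load Google's Universal Sentence Encoder (512-d sentence embeddings).
use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def similarity_matrix(labels):
    """Embed one label per animation; return pairwise cosine similarity."""
    vecs = use(labels).numpy()                    # shape (n_animations, 512)
    return 1.0 - squareform(pdist(vecs, metric="cosine"))

# Hypothetical inputs: one aggregated keyword per animation, plus a
# placeholder candidate model matrix (e.g., the emotion model).
labels = ["chasing", "hiding", "fighting", "playing"]
observed = similarity_matrix(labels)
emotion_model = np.random.rand(len(labels), len(labels))
emotion_model = (emotion_model + emotion_model.T) / 2  # symmetrize

# Correlate the off-diagonal structure of the two matrices.
iu = np.triu_indices(len(labels), k=1)
rho, p = spearmanr(observed[iu], emotion_model[iu])
print(f"model fit: rho={rho:.2f}, p={p:.3f}")
```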
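The network frequency analysis could likewise take several forms; one simple reading is a label co-occurrence graph in which hubs are the most connected labels. The version below, using networkx, rests on that assumption, and the label sets, edge definition, and weighted-degree hub criterion are all illustrative rather than taken from the abstract.

```python
# Sketch: build a label co-occurrence network across animations and
# identify hub labels by weighted degree.
import itertools
import networkx as nx

# Hypothetical per-animation label sets pooled over participants.
animation_labels = [
    {"fear", "fleeing", "chasing"},
    {"fear", "hiding"},
    {"anger", "chasing", "fleeing"},
]

G = nx.Graph()
for labels in animation_labels:
    for a, b in itertools.combinations(sorted(labels), 2):
        # Edge weight counts how often two labels co-occur on an animation.
        w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# Hubs are the most connected labels (here, by weighted degree).
hubs = sorted(G.degree(weight="weight"), key=lambda kv: kv[1], reverse=True)
print(hubs[:3])  # e.g., negative-emotion and avoidant-action labels
```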