Abstract
When we witness a pedestrian suddenly turn away as another person approaches, we sometimes have a distinct impression that they are avoiding the approaching person. How do such impressions of social avoidance arise from our visual experience? To address this question, we used a physics engine to create 288 videos featuring two geometric shapes moving in an environment (in the style of Heider-Simmel animations). During each video, the shapes moved randomly to explore the environment, except that when the shapes came into proximity, one suddenly changed its direction of motion. These turns were defined by three visual parameters, randomly sampled for each video: turning angle, movement speed after the turn, and distance between the two shapes at the beginning of the turn. Observers (N=45) reported for each video whether it gave rise to an impression of social avoidance. We developed three computational models to capture observers' impressions. Model 1 relies on momentary visual cues from the stimuli (the three parameters defined above) and captures observers' impressions at 67% accuracy. Model 2 uses a recurrent neural network to extract visual features of trajectories within a temporal window of 30 frames around the turning point and achieves similar classification accuracy (64%). Model 3 relies on the visual cues as well as four force-related parameters. The force parameters are estimated from the shape trajectories via the Lennard-Jones potential model from physics: two interacting particles repel each other at close distances, attract each other at moderate distances, and do not interact at large distances. Model 3 achieves the best classification accuracy (74%) in predicting observers' judgments. These results suggest that when viewing moving agents, people spontaneously infer latent information from motion dynamics, such as "social forces" governing their interactions.
This latent information supplements more primitive visual cues to enable rapid impressions of avoidance behavior in social environments.
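The qualitative behavior of the Lennard-Jones potential described above (repulsion at close range, attraction at moderate range, negligible interaction at large range) can be sketched in Python. The parameters `epsilon` and `sigma` here are illustrative stand-ins, not the values fitted to the shape trajectories in the study:

```python
def lennard_jones_force(r, epsilon=1.0, sigma=1.0):
    """Radial force between two agents under a Lennard-Jones potential.

    V(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6)
    F(r) = -dV/dr = (24*epsilon/r) * (2*(sigma/r)**12 - (sigma/r)**6)

    Positive values repel the agents; negative values attract them.
    epsilon and sigma are illustrative placeholders, not fitted values.
    """
    sr6 = (sigma / r) ** 6
    return 24 * epsilon / r * (2 * sr6 ** 2 - sr6)

# Qualitative behavior matching the description in the abstract:
print(lennard_jones_force(0.9))  # close distance: positive (repulsive)
print(lennard_jones_force(1.5))  # moderate distance: negative (attractive)
print(lennard_jones_force(5.0))  # large distance: near zero (no interaction)
```

The sign of the force flips at the potential minimum, r = 2^(1/6) * sigma, separating the repulsive and attractive regimes.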