Abstract
On our daily commutes, we seamlessly weave through crowds, avoiding potential collisions with multiple pedestrians. How do we prioritize which obstacles to avoid? We may respond to the nearest N obstacles (topological threshold), or to all obstacles within a temporal range (visual threshold). We previously described a visual model in which the risk of collision is specified by an obstacle’s change in bearing direction (|ψ'|), and the imminence of collision is specified by its optical expansion (θ'). Here we investigate how many obstacles are avoided by manipulating the visual threshold on θ'⋅|ψ'|. In a VR experiment, participants walked toward a goal (11 m) while avoiding one, two, or three moving avatars (1.1 m/s) that crossed their path (±112.5°). We compared models with four different thresholds, measuring error as the mean distance between model and human positions: (1) A visual threshold fit to multiple obstacles had the lowest error (θ'⋅|ψ'| = 0.20 deg/s, M = 0.381 m, Mdn = 0.267 m). (2) A topological threshold for the single next obstacle had the second-lowest error (θ'⋅|ψ'| = 0.03 deg/s, M = 0.422 m, Mdn = 0.298 m). (3) A topological threshold for the next two obstacles had higher error still (θ'⋅|ψ'| = 0.03 deg/s, M = 0.463 m, Mdn = 0.323 m). (4) Our previous visual threshold, fit to collisions with a single obstacle, had the highest error (θ'⋅|ψ'| = 0.03 deg/s, M = 0.477 m, Mdn = 0.331 m). We then applied the same thresholds to previous data from a participant walking through a crowd of criss-crossing avatars (VSS2023), and found that the 0.20 deg/s threshold again had the lowest error (M = 0.625 m, Mdn = 0.443 m); on average, 1-2 obstacles were above threshold. We conclude that a visual threshold that limits the response to moving obstacles provides the most parsimonious model of human collision avoidance.
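As an illustration only, the two selection rules and the error measure can be sketched as follows. This is a minimal sketch, not the authors' model code: the `Obstacle` class, the function names, and the use of distance to define "nearest" are hypothetical, and it assumes that obstacles whose θ'⋅|ψ'| exceeds the threshold are the ones responded to, which the abstract suggests but does not spell out.

```python
import math
from dataclasses import dataclass


@dataclass
class Obstacle:
    """Hypothetical container for one obstacle's optical variables."""
    name: str
    expansion_rate: float  # θ', optical expansion (deg/s)
    bearing_rate: float    # ψ', change in bearing direction (deg/s)
    distance: float        # metres; assumed ordering for the topological rule


def visual_score(ob: Obstacle) -> float:
    """Product θ'·|ψ'| used as the thresholded quantity."""
    return ob.expansion_rate * abs(ob.bearing_rate)


def select_visual(obstacles, threshold):
    """Visual rule (assumed): keep every obstacle whose score exceeds the threshold."""
    return [ob for ob in obstacles if visual_score(ob) > threshold]


def select_topological(obstacles, n):
    """Topological rule (assumed): keep the n nearest obstacles."""
    return sorted(obstacles, key=lambda ob: ob.distance)[:n]


def mean_position_error(model_path, human_path):
    """Mean Euclidean distance between time-aligned model and human positions."""
    if not model_path or len(model_path) != len(human_path):
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(math.dist(m, h) for m, h in zip(model_path, human_path))
    return total / len(model_path)


# Made-up values, purely for illustration.
obstacles = [
    Obstacle("a", expansion_rate=0.5, bearing_rate=0.8, distance=3.0),
    Obstacle("b", expansion_rate=0.1, bearing_rate=-0.5, distance=1.5),
    Obstacle("c", expansion_rate=1.0, bearing_rate=0.1, distance=6.0),
]
visual_names = [ob.name for ob in select_visual(obstacles, threshold=0.20)]
nearest_names = [ob.name for ob in select_topological(obstacles, n=2)]
error = mean_position_error([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.3), (1.0, 0.5)])
```

Under the visual rule the number of selected obstacles varies from moment to moment with the optical variables, whereas the topological rule always returns a fixed count; that difference is what the model comparison tests.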