Abstract
Daily life requires us to recognize many actions performed by other people. How do we represent such a large number of possible actions in order to efficiently identify them? Prior work suggests that human action representations are organized along a small number of simple dimensions, such as sociality (person-relatedness) and transitivity (object-relatedness). However, most studies have relied on small-scale, controlled stimulus sets to test these dimensions. Here, we curated a naturalistic set of 152 videos sampling a wide range of everyday actions. In an online experiment, 300 participants arranged the videos according to their similarity, broadly defined so as to avoid constraining behavior. We constructed a representational space based on the distances between stimuli using inverse multi-dimensional scaling (Kriegeskorte & Mur, 2012). We then used a data-driven approach, sparse non-negative matrix factorization (NMF; Hoyer, 2004), to investigate the dimensions underlying this behavioral similarity space. By applying sparsity and positivity constraints, NMF can recover continuous interpretable dimensions while also allowing categorical structure to emerge. Nine dimensions were sufficient to reconstruct behavioral responses in held-out data. A separate set of participants labeled the most reliable dimensions as: family; manual labor and chores; communication and office work; locomotion and nature; food and social life; and conflict. These dimensions generalized across different scene settings and action categories within our dataset and were also validated in a separate odd-one-out experiment. While the dimensions were interpretable, none of them mapped onto any previously identified feature space (e.g., sociality or transitivity), highlighting the usefulness of our data-driven approach in generating new hypotheses. Our results suggest that broad distinctions between work, leisure and home life, rather than binary features, organize human action representations. Furthermore, these representations contain more fine-grained information about social relationships and their valence than previously suggested.