Journal of Vision, September 2018, Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
Predicting the Behavioral Similarity Structure of Visual Actions
Author Affiliations
  • Leyla Tarhan
    Department of Psychology, Harvard University
  • Talia Konkle
    Department of Psychology, Harvard University
Leyla Tarhan, Talia Konkle; Predicting the Behavioral Similarity Structure of Visual Actions. Journal of Vision 2018;18(10):428. doi:
Our visual worlds are filled with other people's actions – we watch others run, dance, cook, and crawl. How is this repertoire of visual actions organized? To approach this question, we obtained behavioral similarity measures over 60 videos depicting everyday actions sampled from the American Time Use Survey (ATUS). We then tested how well we could predict this structure using both high-level feature models and neural responses along the visual system. To obtain a similarity space of actions, 20 subjects arranged the videos so that similar actions were nearby (Kriegeskorte & Mur, 2012). Participants' representational structures were moderately similar (noise ceiling: r=0.29-0.36). To understand what properties characterize this representational space, we compared a range of models: models reflecting high-level category information (ATUS labels, e.g. "fitness," "grooming"), mid-level models reflecting the involvement of body parts, and low-level models capturing more primitive visual shape features (gist). Cross-validated prediction scores revealed that category information and body-part involvement predicted behavioral similarity moderately well compared to visual shape features (mean leave-one-out τA: body parts=0.15, category=0.14, action target=0.09, gist=-0.01). Additionally, visual cortex responses to the same videos measured using fMRI (N=13) did not predict the similarities well (τA=0.08), indicating that this behavioral similarity space is not directly represented within visual cortex. These patterns of data were robust across two different video sets of the same 60 actions. These results on action similarity echo recent work in both the object and scene domains (Jozwik, 2017; Groen, 2017): human similarity judgments are best predicted by higher-level properties related to items' functions (what they are), rather than by the mid-level visual features driving neural responses in the visual system (how they look).
Broadly, these findings suggest that explicit similarity judgments may derive from an underlying categorical representation rather than from a common multi-dimensional feature space.
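The cross-validated model-comparison step described above can be sketched roughly as follows. This is a minimal Python illustration, not the authors' analysis code: it assumes each model and each participant's similarity judgments are expressed as vectorized representational dissimilarity matrices (one dissimilarity value per video pair), fits linear weights over model features on all-but-one subject, and scores the held-out subject with Kendall's τA. The helper names (`kendall_tau_a`, `loo_model_score`) are hypothetical.

```python
import numpy as np
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / n_pairs; ties
    count toward neither, so the denominator is all pairs."""
    n = len(x)
    s = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)

def loo_model_score(feature_rdms, subject_rdms):
    """Leave-one-subject-out prediction score for a feature model.

    feature_rdms: (n_features, n_pairs) vectorized model RDMs
    subject_rdms: (n_subjects, n_pairs) vectorized behavioral RDMs
    For each fold: fit linear weights to the mean RDM of the training
    subjects, predict dissimilarities, and correlate (tau-a) with the
    held-out subject's RDM. Returns the mean score across folds.
    """
    X = np.asarray(feature_rdms, dtype=float).T   # (n_pairs, n_features)
    S = np.asarray(subject_rdms, dtype=float)
    scores = []
    for k in range(S.shape[0]):
        train_mean = np.delete(S, k, axis=0).mean(axis=0)
        w, *_ = np.linalg.lstsq(X, train_mean, rcond=None)
        pred = X @ w
        scores.append(kendall_tau_a(pred, S[k]))
    return float(np.mean(scores))
```

Under this scheme a model that captures the judgments well approaches the noise ceiling, while an unrelated model hovers near zero, which is the pattern the abstract reports for the category/body-part models versus gist.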

Meeting abstract presented at VSS 2018

