September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract
Emergent dimensions underlying human perception of the reachable world
Author Affiliations & Notes
  • Emilie L. Josephs
    Harvard University
  • Martin N. Hebart
    Max Planck Institute for Human Cognitive and Brain Sciences
  • Talia Konkle
    Harvard University
  • Footnotes
    Acknowledgements  This work was funded by an NIH R21 to TK (R21EY031867), and by a Pershing Square Fund for Research on the Foundations of Human Behavior grant to EJ.
Journal of Vision September 2021, Vol.21, 2154. doi:
      Emilie L. Josephs, Martin N. Hebart, Talia Konkle; Emergent dimensions underlying human perception of the reachable world. Journal of Vision 2021;21(9):2154.

      © ARVO (1962-2015); The Authors (2016-present)

Near-scale, reach-relevant environments are the interface of our manual interactions with the physical world. Recent efforts have begun to probe perceptual and neural representations of naturalistic visual experience at this scale (Josephs & Konkle, 2019; Josephs & Konkle, 2020). Here, we use a computational approach to uncover major dimensions that can parameterize these spaces and predict human similarity judgments. In a large-scale online experiment, 1.2 million odd-one-out judgments were obtained on triplets sampled from 987 images of reachable-scale spaces (hereafter “reachspaces”), drawn from 329 different categories (N = 3,112 Turkers). This yielded a partial sampling of the similarity structure among the images. We then generated a Sparse Positive Similarity Embedding (Hebart et al., 2020), a predictive model of the full similarity structure in which each image is represented as a point in a multi-dimensional space and the dimensions are inferred under sparsity and positivity constraints. This procedure yielded a 31-dimensional embedding that predicted odd-one-out judgments with 59.8% accuracy (chance = 33%, noise ceiling = 66.3%). In a validation experiment (N = 322), we fully sampled pairwise dissimilarities among a subset of 45 images and found that these correlated closely with the dissimilarity structure predicted by the model (r = 0.87). K-means clustering over pairwise dissimilarities derived from the embedding showed two major distinctions among reachspaces: food-related vs. non-food reachspaces, and digital vs. analogue reachspaces. Additionally, examination of the 31 dimensions comprising the embedding revealed interpretable attributes, e.g. “entertainment-related” (e.g. chessboards, poker tables), “navigation-related” (e.g. steering wheels, cockpits), “storage-related” (e.g. drawers, shelves), and “cluttered” (e.g. messy desks or tables).
Overall, these dimensions highlight differences in the functions, affordances, and visual appearances of different reach-relevant spaces and suggest that the similarity structure among reachspaces is related to the actions they support. These results provide a novel accounting of the representational structure of the reachable world.
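
As an illustration of how an embedding of this kind can predict an odd-one-out judgment, the sketch below scores each pair in a triplet by the dot product of its non-negative embedding vectors and treats the item left out of the most similar pair as the odd one out. The softmax-over-pairs choice rule is an assumption based on the general description of the embedding method in Hebart et al. (2020); it is not stated in this abstract. The function names, the three-dimensional vectors, and the image labels are hypothetical and are not taken from the study.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def odd_one_out_probs(x_i, x_j, x_k):
    """Probability that item i, j, or k is the odd one out.

    Assumed choice rule: softmax over the three pairwise similarities.
    Item i is odd when pair (j, k) is selected as most similar, and so on.
    """
    sims = [dot(x_j, x_k), dot(x_i, x_k), dot(x_i, x_j)]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def predict_odd_one_out(x_i, x_j, x_k):
    """Index (0, 1, or 2) of the item most likely to be the odd one out."""
    probs = odd_one_out_probs(x_i, x_j, x_k)
    return max(range(3), key=probs.__getitem__)

# Hypothetical sparse, non-negative embeddings (the study's has 31 dimensions).
desk_a = [1.2, 0.0, 0.3]
desk_b = [1.0, 0.1, 0.0]
kitchen_counter = [0.0, 1.5, 0.2]

probs = odd_one_out_probs(desk_a, desk_b, kitchen_counter)
choice = predict_odd_one_out(desk_a, desk_b, kitchen_counter)
print(probs, choice)
```

With these made-up vectors the two desks form the most similar pair, so the third item is predicted as the odd one out. Under this rule, prediction accuracy would be the fraction of held-out triplets on which the predicted index matches the human choice.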

