September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract | September 2021
A computational framework for reconstructing mental representations of natural visual concepts
Author Affiliations & Notes
  • Laurent Caplette
    Yale University
  • Nicholas B. Turk-Browne
    Yale University
  • Footnotes
    Acknowledgements  Funding: NSF CCF 1839308 and FRQNT Postdoctoral Scholarship (Canada)
Journal of Vision September 2021, Vol.21, 2297. doi:

      © ARVO (1962-2015); The Authors (2016-present)


Revealing the features of mental representations is a longstanding goal of cognitive psychology. Although progress has been made in uncovering some low-level representations, there is currently no general framework for investigating representations of high-level visual concepts. We developed a computational method to parametrically map points in the semantic space of category labels to points in the space of visual features, allowing us to reconstruct the representations of many common visual concepts. Specifically, we synthesized “CNN-noise” images from random features in an intermediate layer of a convolutional neural network (CNN) and asked 100 observers to indicate what they saw in each image. We translated their written responses to vectors in a continuous space using a semantic embedding. We then used regression to estimate how each CNN feature related to each dimension in this semantic space. Using this semantic-visual mapping, we could extract the CNN features associated with any concept. From these features, we could then synthesize an image to visualize the concept’s prototypical representation. We assessed the quality of these reconstructions (e.g., “grass”, “dog”, “night”) in a separate behavioral validation experiment with 35 observers: 252 of 350 reconstructions were recognized significantly better than chance, suggesting that our method succeeded in visualizing mental representations. We then assessed whether we could predict the semantic content perceived by observers in held-out CNN-noise images: we were able to generate labels closer to the true labels than labels generated by the CNN. Our model also explained similarity judgments of written visual concepts better than the semantic embedding. Finally, it explained unique variance in object representations from high-level visual cortex in fMRI, further suggesting that we captured the structure of mental representations.
In conclusion, we developed a computational framework to integrate visual features and semantic dimensions, allowing us to reveal the features and structure of visual representations.
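
The semantic-visual mapping described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the regression type, the dimensionalities, or the direction of the fit, so the sketch assumes a ridge regression that predicts CNN features from semantic-embedding vectors, and it runs on synthetic data. All names (`fit_semantic_to_visual`, `concept_features`, the array sizes, and the `ridge` penalty) are hypothetical.

```python
import numpy as np


def fit_semantic_to_visual(S, F, ridge=1e-3):
    """Fit a linear map from semantic vectors to CNN features.

    S: (n_trials, n_semantic_dims) embedding of each observer response.
    F: (n_trials, n_cnn_features) random features behind each image.
    Returns B with shape (n_semantic_dims, n_cnn_features).
    Assumption: ridge regression; the abstract only says "regressions".
    """
    d = S.shape[1]
    # Closed-form ridge solution: B = (S'S + ridge * I)^-1 S'F
    return np.linalg.solve(S.T @ S + ridge * np.eye(d), S.T @ F)


def concept_features(s, B):
    """Predict the CNN features associated with one semantic vector s."""
    return s @ B


# Synthetic stand-in for the experiment (all sizes are illustrative).
rng = np.random.default_rng(0)
n_trials, n_dims, n_feats = 500, 10, 40
B_true = rng.normal(size=(n_dims, n_feats))      # hidden "true" mapping
S = rng.normal(size=(n_trials, n_dims))          # embedded responses
F = S @ B_true + 0.1 * rng.normal(size=(n_trials, n_feats))  # noisy features

B_hat = fit_semantic_to_visual(S, F)

# Features for a new "concept" vector; in the study these would then be
# passed to an image-synthesis step, which is not sketched here.
s_concept = rng.normal(size=n_dims)
f_concept = concept_features(s_concept, B_hat)
r = np.corrcoef(f_concept, s_concept @ B_true)[0, 1]
print(f"correlation between predicted and true features: {r:.3f}")
```

In this toy setting the fitted map recovers the hidden one closely because the noise is small relative to the number of trials; real behavioral data would be far noisier, and the penalty would need to be chosen by cross-validation.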

