September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract
Context Matters: Recovering Human Visual and Semantic Knowledge from Machine Learning Analysis of Large-Scale Text Corpora
Author Affiliations & Notes
  • Marius Cătălin Iordan
    Princeton University
  • Tyler Giallanza
    Princeton University
  • Cameron T. Ellis
    Yale University
  • Nicole M. Beckage
    Intel Labs
  • Jonathan D. Cohen
    Princeton University
  • Footnotes
    Acknowledgements  This work was supported in part by the Intel Corporation, the Templeton Foundation, and by NSF REU award #1757554 to T.G.
Journal of Vision September 2021, Vol.21, 2738. doi:
      © ARVO (1962-2015); The Authors (2016-present)

Applying machine learning algorithms to large-scale collections of documents to automatically infer relationships between concepts (embeddings) presents a unique opportunity to investigate, at scale, how human visual and semantic knowledge is organized. This includes how people judge fundamental relationships, such as the similarity between concepts (‘How similar are a cat and a bear?’) and the features that describe them (e.g., size, furriness). However, efforts to date have shown a substantial discrepancy between algorithm predictions and human empirical judgments. Here, we introduce a novel approach to generating embeddings, motivated by the psychological theory that semantic context plays a critical role in human judgments, where context is the topic or domain being considered in the documents (e.g., descriptions of the natural world vs. writings about travel and transportation). Specifically, we train state-of-the-art machine learning algorithms to generate contextually constrained embeddings using contextually relevant text corpora (subsets of Wikipedia containing tens of millions of words). We show that incorporating insights from human cognition into the training procedure of machine learning algorithms greatly improves their ability to predict empirical visual and semantic similarity judgments and feature ratings of contextually relevant concepts: our method exceeds 90% of maximum achievable performance in predicting similarity judgments, and exceeds the best performance to date in predicting feature ratings (e.g., size) for concrete real-world objects (e.g., ‘bear’). Furthermore, our method outperforms models trained on billions of words, which suggests that qualitative, psychologically relevant factors may be as important as sheer data quantity in constructing training sets for machine learning methods used to investigate cognitive phenomena.
By improving the correspondence between representations derived automatically by machine learning methods (embeddings) and empirical measurements of human judgments, the approach we describe helps advance the use of large-scale text corpora to understand the structure of human visual and semantic knowledge.
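As a rough illustration of what "correspondence" between embeddings and human judgments can mean in practice, the sketch below scores concept pairs by the cosine similarity of their embedding vectors and then correlates those scores with human similarity ratings. The abstract does not specify the similarity measure or the evaluation statistic, so cosine similarity and Pearson correlation are assumptions here, and the vectors, ratings, and function names are hypothetical toy values, not data or code from the study.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def predicted_similarities(embeddings, pairs):
    """Model-predicted similarity for each (concept, concept) pair."""
    return [cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs]

# Hypothetical 3-dimensional embeddings (real embeddings have far more dimensions).
toy_embeddings = {
    "cat": [0.9, 0.1, 0.3],
    "bear": [0.8, 0.3, 0.4],
    "fish": [0.1, 0.9, 0.2],
}

# Hypothetical human similarity ratings on a 0-1 scale (illustrative only).
toy_human_ratings = {
    ("cat", "bear"): 0.7,
    ("cat", "fish"): 0.3,
    ("bear", "fish"): 0.2,
}

pairs = list(toy_human_ratings)
model_scores = predicted_similarities(toy_embeddings, pairs)
human_scores = [toy_human_ratings[p] for p in pairs]

# Higher agreement means the embedding space better matches the human ratings.
agreement = pearson(model_scores, human_scores)
print(f"model-human agreement (Pearson r): {agreement:.3f}")
```

Under this framing, comparing embeddings trained on different corpora amounts to swapping in a different `toy_embeddings` table and checking how the agreement score changes.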

