September 2019, Volume 19, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
Speaking about seeing: Verbal descriptions of images reflect their visually perceived complexity
Author Affiliations & Notes
  • Zekun Sun
    Department of Psychological and Brain Sciences, Johns Hopkins University
  • Chaz Firestone
    Department of Psychological and Brain Sciences, Johns Hopkins University
Journal of Vision September 2019, Vol.19, 242. doi:

Zekun Sun, Chaz Firestone; Speaking about seeing: Verbal descriptions of images reflect their visually perceived complexity. Journal of Vision 2019;19(10):242.

© ARVO (1962-2015); The Authors (2016-present)

How does what we say reflect what we see? A powerful approach to representing objects is for the mind to encode them according to their shortest possible “description length”. Intriguingly, such information-theoretic encoding schemes often predict a non-linear relationship between an image’s “objective” complexity and the actual resources devoted to representing it, because excessively complex stimuli might have simple underlying explanations (e.g. if they were generated randomly). How widely are such schemes implemented in the mind? Here, we explore a surprising relationship between the perceived complexity of images and the complexity of spoken descriptions of those images. We generated a library of visual shapes, and quantified their complexity as the cumulative surprisal of their internal skeletons — essentially measuring the amount of information in the objects. Subjects then freely described these shapes in their own words, producing more than 4000 unique audio clips. Interestingly, we found that the length of such spoken descriptions could be used to predict explicit judgments of perceived complexity (by a separate group of subjects), as well as ease of visual search in arrays containing simple and complex objects. But perhaps more surprisingly, the dataset of spoken descriptions revealed a striking quadratic relationship between the objective complexity of the stimuli and the length of their spoken descriptions: Both low-complexity stimuli and high-complexity stimuli received relatively shorter verbal descriptions, with a peak in spoken description length occurring for intermediately complex objects. Follow-up experiments went beyond individual objects to complex arrays that varied in how visually grouped or random they were, and found the same pattern: Highly grouped and highly random arrays were tersely described, while moderately grouped arrays garnered the longest descriptions. 
The results establish a surprising connection between linguistic expression and visual perception: The way we describe images can reveal how our visual systems process them.
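The complexity measure used above — the cumulative surprisal of a shape's internal skeleton — is, at bottom, the summed information content of the shape's parts. A minimal sketch of that idea follows; the abstract does not specify the generative model of skeletons, so the per-part probabilities here are hypothetical placeholders, not the authors' actual measure:

```python
import math

def cumulative_surprisal(probs):
    """Sum the surprisal, -log2(p), over the parts of a shape's skeleton.

    `probs` is a hypothetical list of probabilities, one per skeletal
    segment or branch, under some assumed generative model of shapes.
    Improbable parts contribute more bits, so shapes with many
    unexpected features score as more complex.
    """
    return sum(-math.log2(p) for p in probs)

# A skeleton made of predictable parts (high p) carries fewer bits
# than one with many improbable turns and branches (low p):
plain = cumulative_surprisal([0.9, 0.9, 0.8])
ornate = cumulative_surprisal([0.1, 0.05, 0.2])
# ornate > plain: the less probable skeleton encodes more information
```

Note that this captures only the "objective" complexity axis of the study; the reported quadratic relationship arises because spoken description length, unlike this measure, falls off again for the most complex (e.g. random-looking) stimuli.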

Acknowledgement: JHU Science of Learning Institute 
