Vision Sciences Society Annual Meeting Abstract | September 2015
Computational similarities between visual and auditory cortex studied with convolutional neural networks, fMRI, and electrophysiology
Author Affiliations
  • Alexander Kell
    Brain & Cognitive Sciences, MIT (these authors contributed equally)
  • Daniel Yamins
    Brain & Cognitive Sciences, MIT (these authors contributed equally)
  • Sam Norman-Haignere
    Brain & Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT
  • Darren Seibert
    Brain & Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT
  • Ha Hong
    Brain & Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT
  • Jim DiCarlo
    Brain & Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT
  • Josh McDermott
    Brain & Cognitive Sciences, MIT
Journal of Vision September 2015, Vol.15, 1093. doi:https://doi.org/10.1167/15.12.1093

      Alexander Kell, Daniel Yamins, Sam Norman-Haignere, Darren Seibert, Ha Hong, Jim DiCarlo, Josh McDermott; Computational similarities between visual and auditory cortex studied with convolutional neural networks, fMRI, and electrophysiology. Journal of Vision 2015;15(12):1093. https://doi.org/10.1167/15.12.1093.

Abstract

Visual and auditory cortex both support impressively robust invariant recognition abilities, but operate on distinct classes of signals. To what extent are similar computations used across modalities? We examined this question by comparing state-of-the-art computational models to neural data from visual and auditory cortex. Using recent “deep learning” techniques, we built two hierarchical convolutional neural networks: an auditory network optimized to recognize words from spectrograms, and a visual network optimized to categorize objects from images. Each network performed as well as humans on the difficult recognition task on which it was trained. Independently, we measured neural responses to (i) a broad set of natural sounds in human auditory cortex (using fMRI); and (ii) diverse naturalistic images in macaque V4 and IT (using multi-array electrophysiology). We then computed the responses of each network to these same sounds and images, and used cross-validated linear regression to determine how well each layer of each model predicted the measured neural responses. Each network predicted the cortical responses in its modality well, explaining substantially more variance than alternative leading models. Moreover, for each modality, lower layers of the network better predicted primary cortical responses, while higher layers better predicted non-primary cortical responses, suggestive of hierarchical functional organization. Our key finding is that both the visual network and the auditory network predicted auditory cortical responses equally well in primary auditory cortex and in some nearby non-primary regions (including regions implicated in pitch perception). In contrast, in areas more distant from primary auditory cortex, the auditory network predicted responses substantially better than the visual network. Our findings suggest that early stages of sensory cortex could instantiate similar computations across modalities, potentially providing input to subsequent stages of processing that are modality-specific. We are currently analyzing the auditory network’s prediction of visual cortical responses.
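For readers who want to see the analysis logic concretely, the layer-wise prediction step can be sketched roughly as follows, in Python. This is an illustrative reconstruction rather than the authors' code: the function name, the use of ridge-regularized regression, the fold structure, and the per-site R^2 summary are all assumptions standing in for details the abstract does not specify.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def layerwise_prediction_scores(layer_activations, neural_responses,
                                n_splits=5, alpha=1.0):
    """For each model layer, fit a cross-validated linear mapping from the
    layer's unit activations to measured neural responses, and summarize
    prediction accuracy as the median explained variance across sites.

    layer_activations : dict of layer name -> (n_stimuli, n_units) array
    neural_responses  : (n_stimuli, n_sites) array (fMRI voxels or electrode sites)
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = {}
    for name, acts in layer_activations.items():
        site_r2 = np.zeros(neural_responses.shape[1])
        for train_idx, test_idx in kf.split(acts):
            # Fit the linear mapping on training stimuli only (ridge penalty
            # is an assumption; the abstract specifies only "linear regression").
            model = Ridge(alpha=alpha)
            model.fit(acts[train_idx], neural_responses[train_idx])
            pred = model.predict(acts[test_idx])
            held_out = neural_responses[test_idx]
            resid = ((held_out - pred) ** 2).sum(axis=0)
            total = ((held_out - held_out.mean(axis=0)) ** 2).sum(axis=0)
            site_r2 += 1.0 - resid / total  # per-site R^2 on held-out stimuli
        scores[name] = float(np.median(site_r2 / n_splits))
    return scores

In this sketch, the layer whose features best predict held-out responses in a given cortical region is taken as the model stage most closely matching that region, which is the comparison behind the primary-versus-non-primary contrast described above. A hypothetical call would look like scores = layerwise_prediction_scores(auditory_net_activations, auditory_fmri_responses), where both inputs are illustrative arrays of model activations and measured responses to the same stimulus set.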

Meeting abstract presented at VSS 2015
