September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
Representing contextual associations in convolutional neural networks
Author Affiliations
  • Eric Roginek
    Fordham University
  • Shira Baror
    New York University
  • Daniel Leeds
    Fordham University
  • Elissa Aminoff
    Fordham University
Journal of Vision September 2021, Vol.21, 2045. doi:

      Eric Roginek, Shira Baror, Daniel Leeds, Elissa Aminoff; Representing contextual associations in convolutional neural networks. Journal of Vision 2021;21(9):2045.


      © ARVO (1962-2015); The Authors (2016-present)


Contextual associations play a significant role in facilitating object recognition in human vision. However, the role of contextual information in artificial vision remains elusive. We aimed to examine whether contextual associations are represented in an artificial neural network and, if so, at which layers they may play a role. To this end, we tested whether objects that share contextual associations (e.g., bicycle-helmet) are represented more similarly in convolutional neural networks than objects that do not share a context (e.g., bicycle-fork), and where in the network these context-based representational similarities emerge. As a comparison, we also examined the representational similarity of objects that belong to the same category (e.g., two different shoes) relative to objects that do not (e.g., shoe-brush). Representational similarity was computed as the correlation between unit responses to pairs of images in and out of context (or in and out of category, as a comparison). In a VGG16 neural network trained on ImageNet for object categorization, representational similarity among objects that share a context (N = 70) was substantially higher than similarity among objects that do not share a context. This context-based rise in similarity emerged at very early layers of the network, remarkably at the same layer at which category-based similarity emerged. Category-based similarity was significantly larger than context-based similarity throughout the network. Pixel similarity between contextually paired objects was no greater than pixel similarity between objects that do not share a context. Thus, even though the network was designed for categorical object recognition, contextual relationships were evident across its early, mid, and late layers. This suggests that context is inherently preserved and represented across the network, and may have a critical role in facilitating object recognition both in humans and in artificial models.
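
The similarity measure described in the abstract, a correlation between unit responses to a pair of images, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `mean_pair_similarity`), the use of Pearson correlation specifically, and the toy response vectors are all assumptions made for illustration. In practice the vectors would be the activations of one network layer for each image.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length response vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def mean_pair_similarity(responses, pairs):
    """Average correlation over a list of image pairs for one layer."""
    return sum(pearson(responses[i], responses[j]) for i, j in pairs) / len(pairs)


# Hypothetical unit responses for a single layer.
# These numbers are made up for illustration; they are not data from the study.
layer_responses = {
    "bicycle": [0.9, 0.1, 0.4, 0.7],
    "helmet": [0.8, 0.2, 0.5, 0.6],
    "fork": [0.1, 0.9, 0.6, 0.2],
}

in_context = [("bicycle", "helmet")]    # pair that shares a context
out_of_context = [("bicycle", "fork")]  # pair that does not

# A positive difference means in-context pairs are represented more similarly.
context_effect = (
    mean_pair_similarity(layer_responses, in_context)
    - mean_pair_similarity(layer_responses, out_of_context)
)
print(f"context effect at this layer: {context_effect:.3f}")
```

Repeating this computation on the responses of each layer in turn would show where in the network the context effect first appears, and running it on raw pixel values instead of unit responses would give the pixel-similarity baseline mentioned above.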

