October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract  |   October 2020
Conjunctive coding of color and shape in convolutional neural networks
Author Affiliations & Notes
  • JohnMark Taylor
    Harvard University
  • Yaoda Xu
    Yale University
  • Footnotes
    Acknowledgements  This work is supported by a National Science Foundation Graduate Research Fellowship (DGE1745303) to JohnMark Taylor
Journal of Vision October 2020, Vol.20, 400. doi:https://doi.org/10.1167/jov.20.11.400
      JohnMark Taylor, Yaoda Xu; Conjunctive coding of color and shape in convolutional neural networks. Journal of Vision 2020;20(11):400. https://doi.org/10.1167/jov.20.11.400.


      © ARVO (1962-2015); The Authors (2016-present)


Understanding how the visual system conjunctively codes color and shape has long fascinated cognitive psychologists, cognitive neuroscientists, and neurophysiologists. Recent developments in convolutional neural networks (CNNs) provide an excellent opportunity to examine how color and shape conjunctions may be encoded in artificial systems trained only to perform object recognition. To determine whether CNNs encode color and shape independently or interactively, we used representational similarity analysis to characterize the responses of AlexNet, VGG19, CORnet, ResNet, and GoogLeNet to different objects, each presented in several different colors. Regardless of the CNN examined, we found that whereas lower layers encode color in a similar manner across different objects, in higher layers the color spaces associated with different objects become more distinct. The converse is also true: early layers encode shape more similarly across colors than later layers do. Interestingly, the similarity between the color spaces of different objects was only weakly (though significantly) associated with the objects’ shape similarity. These results held when color and shape similarity were equated, and when uniformly colored “silhouette” images were used instead of naturally textured images. Together, these results demonstrate that rather than being encoded orthogonally, color and shape processing becomes increasingly interactive in higher layers of a CNN, suggesting that neural networks optimized for object recognition naturally develop conjunctive coding of color and shape. These results will be compared with responses from visual regions of the human brain to test whether a similar conjunctive coding scheme exists in natural visual systems.
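The representational similarity logic described above can be sketched in code. The snippet below is a minimal illustration, not the authors' pipeline: it uses synthetic random "activations" in place of real CNN layer responses, and hypothetical dimensions (4 objects, 6 colors, 512 units). For each object it builds a color-by-color dissimilarity matrix (RDM) over that object's activation patterns, then correlates the color RDMs of different objects; a high cross-object correlation would indicate that color is represented similarly regardless of object (separable coding), while a low correlation would indicate object-dependent color spaces (interactive coding), the layer-by-layer trend the abstract reports.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_objects, n_colors, n_units = 4, 6, 512  # hypothetical sizes for illustration

# Stand-in for one CNN layer's responses: (object, color, unit).
# In a real analysis these would come from the network's activations
# to each object image rendered in each color.
acts = rng.normal(size=(n_objects, n_colors, n_units))

def color_rdm(obj_acts):
    """Condensed correlation-distance RDM over one object's color conditions."""
    return pdist(obj_acts, metric="correlation")  # length C*(C-1)/2

rdms = np.stack([color_rdm(acts[o]) for o in range(n_objects)])

# Spearman-correlate the color RDMs of every pair of objects:
# high values -> similar color spaces across objects (separable coding),
# low values  -> object-specific color spaces (interactive coding).
sims = []
for i in range(n_objects):
    for j in range(i + 1, n_objects):
        rho, _ = spearmanr(rdms[i], rdms[j])
        sims.append(rho)

print(f"mean cross-object color-space similarity: {np.mean(sims):.3f}")
```

With random activations the mean similarity hovers near zero; applied to real layer activations, repeating this per layer would trace how cross-object color-space similarity changes from early to late layers.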

