December 2022
Volume 22, Issue 14
Open Access
Vision Sciences Society Annual Meeting Abstract  |   December 2022
Visual Relations in Humans and Deep Convolutional Neural Networks
Author Affiliations
  • Nicholas Baker
    Loyola University of Chicago
  • Patrick Garrigan
    St. Joseph's University
  • Austin Phillips
    University of California, Los Angeles
  • Philip Kellman
    University of California, Los Angeles
Journal of Vision December 2022, Vol.22, 3391. doi:
Nicholas Baker, Patrick Garrigan, Austin Phillips, Philip Kellman; Visual Relations in Humans and Deep Convolutional Neural Networks. Journal of Vision 2022;22(14):3391.

      © ARVO (1962-2015); The Authors (2016-present)

Deep convolutional neural networks (DCNNs) have attracted considerable interest as models of human perception. Previous research showed that, unlike humans, DCNNs have no sensitivity to global object shape. We investigated whether this limitation, involving spatial relations among parts, may be an instance of a more general insensitivity to abstract visual relations. We tested DCNNs’ learning and generalization of displays involving three relations: Same/Different, Enclosure, and More/Fewer. For Same/Different, we generated 20 shapes and used them in training images, each containing a pair of shapes that were either the same or different. ImageNet-trained DCNNs were trained to respond "same" or "different" for varied positions and sizes of shapes. We then tested whether learning generalized to new shape pairs. For Enclosure, each training image consisted of one closed contour and 22 open contour fragments, with a red dot placed either inside or outside the closed contour. We retrained DCNNs to report whether the dot was inside or outside the closed shape and then tested whether learning generalized to contours with lengths that differed from the training examples. For More/Fewer, we generated pairs of polygons with differing numbers of sides. One polygon was always red, while the other had a random non-red color. We trained DCNNs to judge whether the red polygon had more or fewer sides than the other polygon and then tested generalization to polygon pairs with fewer sides. Results: Across all experiments, DCNNs achieved some degree of successful classification with the given set of training stimuli but showed no evidence of generalization of learning to modestly different cases. The results suggest that the relations were not learned. DCNNs appear to have crucial limitations that derive from their lack of computations involving abstraction and relational processing of the sort that is fundamental in human perception.

