September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2018
Do Deep Neural Networks Suffer from Crowding?
Author Affiliations
  • Gemma Roig
    Center for Brains Minds and Machines, MIT; ISTD, Singapore University of Technology and Design
  • Anna Volokitin
    CVL, ETH Zurich
  • Tomaso Poggio
    Center for Brains Minds and Machines, MIT
Journal of Vision September 2018, Vol.18, 902. doi:

Gemma Roig, Anna Volokitin, Tomaso Poggio; Do Deep Neural Networks Suffer from Crowding? Journal of Vision 2018;18(10):902. doi:

      © ARVO (1962-2015); The Authors (2016-present)

Crowding is a visual effect in humans in which an object that can be recognized in isolation can no longer be recognized when other objects, so-called clutter, are placed close to it. In this work, we study the effect of crowding in artificial Deep Neural Networks (DNNs) for object recognition. We analyze both deep convolutional neural networks (DCNNs) and an extension of DCNNs, called eccentricity-dependent models, that is multi-scale and changes the receptive field size of the convolution filters with their position in the image. These networks have recently been proposed for modeling the feedforward path of the primate visual cortex. Our results reveal that including clutter in the training images does not make the DNNs robust to clutter not seen during training. Also, when DNNs are trained on objects in isolation, recognition accuracy falls the closer the clutter is to the target object and the more clutter there is. We find that visual similarity between the target and clutter also plays a role and that pooling in early layers of the DNN leads to more crowding. Finally, we show that the eccentricity-dependent model trained on objects in isolation can recognize such target objects in clutter if the objects are near the center of the image, whereas the DCNN cannot.
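
The multi-scale, eccentricity-dependent idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it only shows one plausible way to build an input whose sampling resolution falls off away from the image center, by taking centered crops of increasing size and pooling each back to a common resolution. The function names, the power-of-two scale factors, and the use of average pooling are all assumptions made for illustration.

```python
def center_crop(image, size):
    """Return the centered size x size window of a square 2D list."""
    n = len(image)
    start = (n - size) // 2
    return [row[start:start + size] for row in image[start:start + size]]


def average_pool(image, factor):
    """Downsample by averaging non-overlapping factor x factor blocks."""
    n = len(image) // factor
    return [
        [
            sum(image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)) / factor ** 2
            for c in range(n)
        ]
        for r in range(n)
    ]


def multiscale_crops(image, base_size, num_scales):
    """Hypothetical helper: centered crops of side base_size * 2**k,
    each pooled back to base_size x base_size.

    Small crops keep full resolution near the center; large crops cover
    the periphery at coarser resolution (an assumed scheme).
    """
    largest = base_size * 2 ** (num_scales - 1)
    if largest > len(image):
        raise ValueError("image is too small for the requested scales")
    crops = []
    for k in range(num_scales):
        factor = 2 ** k
        crop = center_crop(image, base_size * factor)
        crops.append(average_pool(crop, factor))
    return crops


# Toy 8 x 8 "image" whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
crops = multiscale_crops(image, base_size=2, num_scales=3)
```

Each entry of `crops` has the same 2 x 2 shape, so the scales could be stacked as channels and passed to shared convolution filters. In such a scheme a filter of fixed size covers a larger part of the original image at the coarser scales, which is one way receptive field size can grow with distance from the center.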

Meeting abstract presented at VSS 2018

