September 2021
Volume 21, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2021
How crowding challenges (feedforward) convolutional neural networks
Author Affiliations & Notes
  • Ben Lonnqvist
    Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  • Adrien Doerig
    Donders Institute for Brain, Cognition & Behaviour, Nijmegen, Netherlands
  • Alban Bornet
    Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  • Gregory Francis
    Department of Psychological Sciences, Purdue University, West Lafayette, USA
  • Lynn Schmittwilken
    Exzellenzcluster Science of Intelligence, Technische Universität Berlin
  • Michael H. Herzog
    Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  • Footnotes
    Acknowledgements  BL was supported by the Swiss National Science Foundation grant no. 176153, "Basics of visual processing: from elements to figures".
Journal of Vision September 2021, Vol.21, 2039. doi:https://doi.org/10.1167/jov.21.9.2039
      Ben Lonnqvist, Adrien Doerig, Alban Bornet, Gregory Francis, Lynn Schmittwilken, Michael H. Herzog; How crowding challenges (feedforward) convolutional neural networks. Journal of Vision 2021;21(9):2039. https://doi.org/10.1167/jov.21.9.2039.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Are (feedforward) convolutional neural networks (CNNs) good models of the human visual system? Here, we used visual crowding as a well-controlled psychophysical test to probe CNNs. Visual crowding is a ubiquitous breakdown of object recognition in the human visual system, whereby targets become jumbled and unrecognisable in the presence of flanking objects. Humans exhibit several well-documented effects of crowding, such as invariance to size, where the size of the target and flanker letters may be changed without affecting the strength of crowding. We show that feedforward CNNs are unable to reproduce invariance to size, confusion between target and flanker identities, and, importantly, uncrowding, in which increasing the number of flankers paradoxically improves performance. We investigate uncrowding using a recurrent, neurally inspired model called LAMINART, which we find can reproduce uncrowding as observed in humans. Furthermore, we show that capsule networks, a recurrent family of CNNs with grouping and segmentation mechanisms, outperform all other models of uncrowding to date, demonstrating the importance of grouping and segmentation mechanisms in visual information processing in general.
