September 2019
Volume 19, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
Comparing Search Strategies of Humans and Machines in Clutter
Author Affiliations & Notes
  • Claudio Michaelis
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
  • Marlene Weller
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
  • Christina Funke
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
  • Alexander S. Ecker
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
    Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
  • Thomas S.A. Wallis
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
  • Matthias Bethge
    Centre for Integrative Neuroscience, University of Tuebingen, Germany
    Max Planck Institute for Biological Cybernetics, Tuebingen, Germany
    Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
Journal of Vision September 2019, Vol. 19, 309c. doi: https://doi.org/10.1167/19.10.309c
Abstract

While many perceptual tasks become more difficult in the presence of clutter, the human visual system has evolved considerable tolerance to cluttered environments. In contrast, current machine learning approaches struggle with clutter. We compare human observers and convolutional neural networks (CNNs) on two target localization tasks using cluttered images composed of characters or rendered objects. Each trial consists of such a cluttered image together with a separate image of one object that has to be localized. Human observers report whether the object lies in the left or right half of the image; accuracy, reaction time, and eye movements are recorded. CNNs are trained to segment the object, and the center of mass of the predicted segmentation mask is used as the predicted position. Clutter level is defined by set size, ranging from 2 to 256 objects per image. We find that as clutter increases, human processing times increase while the accuracy of the machine learning models drops. This points to a critical difference between human and machine processing: humans search serially, whereas current machine learning models typically process the whole image in one pass. Following this line of thought, we show that models with two iterations of processing perform significantly better than the purely feed-forward CNNs that dominate current object recognition applications. This finding suggests that, when confronted with challenging scenes, iterative processing may be just as important for machines as it is for humans.
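The localization readout described above — reducing a predicted segmentation mask to a left/right decision via its center of mass — can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name, mask shapes, and thresholding are assumptions.

```python
import numpy as np

def predict_side(mask):
    """Predict whether the target lies in the left or right image half
    from a predicted segmentation mask (2D array of per-pixel scores).

    Sketch of the readout described in the abstract: the center of mass
    of the segmentation mask serves as the predicted target position.
    """
    mask = np.asarray(mask, dtype=float)
    total = mask.sum()
    if total == 0:
        return None  # network predicted no target pixels
    # Center of mass along the horizontal axis (columns).
    cols = np.arange(mask.shape[1])
    x_com = (mask.sum(axis=0) * cols).sum() / total
    return "left" if x_com < mask.shape[1] / 2 else "right"

# Toy example: a 4x8 mask with the predicted object in the right half.
mask = np.zeros((4, 8))
mask[1:3, 5:7] = 1.0
print(predict_side(mask))  # → right
```

In the study itself the prediction would come from a trained segmentation CNN; here a hand-made binary mask stands in for that output.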
