August 2023
Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2023
Deep learning classifiers match human accuracies but not the quirks
Author Affiliations
  • Joseph MacInnes
    Swansea University
  • Natalia Zhozhikashvili
  • Kirill Koretaev
    Purple Gaze
  • Feurra Matteo
    HSE University
Journal of Vision August 2023, Vol.23, 5098. doi:https://doi.org/10.1167/jov.23.9.5098
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Deep learning convolutional neural networks (CNNs) have shown impressive results on many computer vision tasks. They have also performed well at modelling human vision, leading some to suggest that they are inherently good models of human visual processing. Since CNNs are classifiers, they typically transform problems into classification and excel when results are measured as an accuracy score. Less well studied is a CNN’s ability to model the errors, mistakes and other incongruities that people produce when interpreting their visual world. We tested the ability of CNNs to model human data for cognitive and neural phenomena that highlight peculiarities of human vision. Specifically, we tested the ability of a CNN trained on upright faces and houses to model results from the face inversion effect (FIE) and the impact of transcranial magnetic stimulation (TMS) on face and object recognition. We gathered data from 19 participants performing a matching task for faces or houses. Behavioural conditions included upright and inverted stimuli. TMS conditions included rOFA, rOPA or Sham. Human accuracy scores showed a typical FIE, and our TMS manipulation reduced the FIE by impairing identification accuracy for upright face pairs (although we did not replicate the expected double dissociation reported by Pitcher et al. (2011) and Dilks et al. (2013)). We trained a series of CNNs on upright faces and houses to match human matching accuracy and then tested them on the same inverted stimuli shown to human participants. While we could easily match human performance on upright faces, none of the networks showed the FIE when tested on inverted stimuli. In fact, the only interaction from a CNN solution was a house inversion effect. We further modified our CNN solutions by perturbing the weights of the mid-network layers to simulate the virtual lesioning of the TMS conditions. Again, the CNN lesioning was not able to match human TMS results.
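
The two quantities the abstract relies on, an inversion effect and a weight perturbation used as a virtual lesion, can be illustrated with a minimal sketch. The abstract does not specify how the weights were perturbed or how the inversion effect was scored, so everything here is an assumption for illustration: the inversion effect is taken as the accuracy drop from upright to inverted stimuli, the lesion is modelled as additive Gaussian noise scaled by the spread of a layer's weights, and the function names `inversion_effect` and `lesion_weights` are hypothetical.

```python
import random
import statistics


def inversion_effect(acc_upright, acc_inverted):
    """Accuracy drop from upright to inverted stimuli (assumed definition)."""
    return acc_upright - acc_inverted


def lesion_weights(weights, noise_scale, seed=None):
    """Return a perturbed copy of one layer's weights.

    Assumption: the lesion is additive Gaussian noise whose standard
    deviation is `noise_scale` times the standard deviation of the
    layer's own weights. The abstract does not state the actual method.
    """
    rng = random.Random(seed)
    sigma = noise_scale * statistics.pstdev(weights)
    return [w + rng.gauss(0.0, sigma) for w in weights]


# Toy stand-in for the flattened weights of one mid-network layer.
layer = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48]
lesioned = lesion_weights(layer, noise_scale=0.5, seed=0)
print(len(lesioned) == len(layer))
```

Under these assumptions, a comparison with the human data would compute `inversion_effect` separately for faces and houses, before and after applying `lesion_weights` to the chosen layers, and ask whether the lesioned network reproduces the reduced FIE seen under TMS.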
