Vision Sciences Society Annual Meeting Abstract | September 2021 | Volume 21, Issue 9 | Open Access
Towards acquisition of shape bias: Training convolutional neural networks with blurred images
Author Affiliations & Notes
  • Sou Yoshihara
    Graduate School of Informatics, Kyoto University
  • Taiki Fukiage
    Human Information Science Laboratory, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
  • Shin'ya Nishida
    Graduate School of Informatics, Kyoto University
    Human Information Science Laboratory, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
  • Footnotes
    Acknowledgements: This work was supported by JSPS KAKENHI Grant Number JP20H00603.
Journal of Vision September 2021, Vol.21, 2275. doi:https://doi.org/10.1167/jov.21.9.2275
      Sou Yoshihara, Taiki Fukiage, Shin'ya Nishida; Towards acquisition of shape bias: Training convolutional neural networks with blurred images. Journal of Vision 2021;21(9):2275. https://doi.org/10.1167/jov.21.9.2275.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

ImageNet-trained Convolutional Neural Networks (CNNs) classify objects relying more on texture features than on shape features (texture bias), whereas humans show the opposite, a shape bias (Geirhos et al., 2019). We suspect that humans' shape bias may be acquired by experiencing both sharp and blurred images during early visual development (which starts from a blurred visual world) and/or in daily life (where optical blur is often produced by ocular defocus and atmospheric light scattering). To test this idea, we trained AlexNet with the original sharp images (S-Net), with Gaussian-blurred images (B-Net), and with a mixture of blurred and sharp images (B+S-Net). In comparison with S-Net, B-Net showed a higher shape bias but a lower classification accuracy on sharp images. B+S-Net, on the other hand, showed a higher shape bias while keeping high classification accuracy on both sharp and blurred images (blur robustness). The degree of shape bias shown by B+S-Net was not as high as that of humans or of AlexNet trained on the unnatural Stylized ImageNet (Geirhos et al., 2019), but it was comparable to that of VOneNet (Dapello et al., 2020). Another training condition simulating the time course of infant development (trained initially with blurred images and later with sharp images, B2S-Net) showed characteristics intermediate between S-Net and B+S-Net. B2S-Net might behave more like B+S-Net if given an additional mechanism to prevent forgetting of early experiences, such as a critical period. To understand how our training led to enhanced shape bias and blur robustness, we visualized the receptive fields of the first convolutional layer and found that spatial frequency tuning was shifted to a lower range for B+S-Net in comparison with S-Net. Furthermore, representational dissimilarity matrices (RDMs) indicated that sharp and blurred images are represented similarly in the higher convolutional layers of B+S-Net, suggesting the development of frequency-invariant representations through blur-mixed training.
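
The blur-mixed (B+S) training condition can be sketched as a simple data-augmentation pipeline. Below is a minimal illustration in PyTorch, assuming torchvision's GaussianBlur transform; the kernel size, sigma range, and 0.5 mixing probability are illustrative assumptions, not the settings used in this study.

```python
from torchvision import transforms
from torchvision.models import alexnet

# Illustrative blur-mixed (B+S) augmentation: each training image is
# Gaussian-blurred with probability 0.5, so the network sees a mixture
# of sharp and blurred inputs. Kernel size, sigma range, and the 0.5
# probability are assumptions for illustration only.
blur_mixed_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply(
        [transforms.GaussianBlur(kernel_size=21, sigma=(1.0, 4.0))],
        p=0.5,
    ),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# AlexNet trained from scratch on ImageNet with the transform above
# would correspond to the B+S-Net condition.
model = alexnet(num_classes=1000)
```

Varying only the blur probability keeps the conditions comparable: p=0.0 corresponds to S-Net, p=1.0 to B-Net, and p=0.5 to B+S-Net, while the B2S condition would instead schedule p from 1.0 toward 0.0 over the course of training.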

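The RDM analysis can likewise be sketched compactly. The following is a minimal NumPy version using 1 minus the Pearson correlation as the dissimilarity measure, a common choice that the abstract does not specify; the array names, stimulus count, and random placeholder activations are hypothetical. Comparing a layer's RDM for sharp images with its RDM for blurred versions of the same images indicates how frequency-invariant that layer's representation is.

```python
import numpy as np

def rdm(activations: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix.

    activations: shape (n_stimuli, n_features), one row of flattened
    layer activations per stimulus. Returns an (n_stimuli, n_stimuli)
    matrix of 1 - Pearson correlation between stimulus pairs.
    """
    z = activations - activations.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return 1.0 - z @ z.T

# Hypothetical usage: acts_sharp and acts_blur hold one layer's
# responses to the same stimuli, sharp vs. Gaussian-blurred. A high
# correlation between the two RDMs' upper triangles suggests a
# frequency-invariant representation at that layer.
acts_sharp = np.random.randn(50, 4096)  # placeholder activations
acts_blur = np.random.randn(50, 4096)
iu = np.triu_indices(50, k=1)
similarity = np.corrcoef(rdm(acts_sharp)[iu], rdm(acts_blur)[iu])[0, 1]
print(f"RDM correlation (sharp vs. blurred): {similarity:.3f}")
```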