September 2024, Volume 24, Issue 10 | Open Access
Vision Sciences Society Annual Meeting Abstract
Contrasting learning dynamics: Immediate generalisation in humans and generalisation lag in deep neural networks
Author Affiliations & Notes
  • Lukas S. Huber
    Cognition, Perception and Research Methods, Department of Psychology, University of Bern
    Neural Information Processing Group, Department of Computer Science, University of Tübingen
  • Fred W. Mast
    Cognition, Perception and Research Methods, Department of Psychology, University of Bern
  • Felix A. Wichmann
    Neural Information Processing Group, Department of Computer Science, University of Tübingen
  • Footnotes
    Acknowledgements  This research was funded by the Swiss National Science Foundation (214659 to LSH). FAW is a member of the Machine Learning Cluster of Excellence, funded by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy—EXC number 2064/1—Project number 390727645.
Journal of Vision September 2024, Vol.24, 1005. doi:https://doi.org/10.1167/jov.24.10.1005
Citation: Lukas S. Huber, Fred W. Mast, Felix A. Wichmann; Contrasting learning dynamics: Immediate generalisation in humans and generalisation lag in deep neural networks. Journal of Vision 2024;24(10):1005. https://doi.org/10.1167/jov.24.10.1005.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Behavioral comparisons of human and deep neural network (DNN) models of object recognition not only help to benchmark and improve DNN models but may also illuminate the intricacies of human visual perception. However, machine-to-human comparisons are often fraught with difficulty: unlike DNNs, which typically learn from scratch using static, uni-modal data, humans process continuous, multi-modal information and leverage prior knowledge. Additionally, while DNNs are predominantly trained in a supervised manner, human learning relies heavily on interactions with unlabeled data. We address these disparities by attempting to align the learning processes and examining not only the outcomes but also the dynamics of representation learning in humans and DNNs. We engaged humans and DNNs in a task to learn representations of three novel 3D object classes. Participants completed six epochs of an image classification task—reflecting the train-test iteration process common in machine learning—with feedback provided only during training phases. To align the starting point of learning, we used pre-trained DNNs. This experimental design ensured that both humans and models learned new representations from the same static, uni-modal inputs in a supervised learning environment. We collected ~6,300 trials from human participants in the laboratory and compared the observed dynamics with those of various DNNs. While DNNs exhibit fast training progress but lagging generalization, human learners often display a simultaneous increase in train and test performance, showcasing immediate generalization. When focusing solely on test performance, however, DNNs align well with the human generalization trajectory. By synchronizing the learning environment and examining the full scope of the learning process, the present study offers a refined comparison of representation learning. The collected data reveal both similarities and differences between human and DNN learning dynamics. This disparity emphasizes that global assessments of DNNs as models of human visual perception seem problematic without considering specific modeling objectives.
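The epoch-wise protocol described in the abstract (supervised feedback only during training, with accuracy recorded on both the train and test splits after every epoch) can be sketched in code. The following is a minimal illustrative sketch, not the study's actual pipeline: the synthetic 2-D features and the simple nearest-centroid learner are stand-ins for the real stimuli and DNN models, chosen only to make the train/test bookkeeping concrete.

```python
# Hypothetical sketch of the six-epoch train/test protocol: labels are used
# only during training, and accuracy on both splits is logged each epoch.
# The data generator and nearest-centroid learner are illustrative stand-ins.
import random

random.seed(0)

def make_split(n_per_class, classes=(0, 1, 2)):
    """Synthetic 2-D features for three novel object classes."""
    data = []
    for label in classes:
        cx, cy = label * 3.0, label * -2.0  # well-separated class centers
        for _ in range(n_per_class):
            data.append(((cx + random.gauss(0, 1), cy + random.gauss(0, 1)), label))
    return data

train_set = make_split(20)
test_set = make_split(10)

centroids = {}  # running per-class mean of training examples seen so far
counts = {}

def predict(x):
    # Classify by nearest class centroid (squared Euclidean distance).
    return min(centroids, key=lambda c: (x[0] - centroids[c][0]) ** 2
                                        + (x[1] - centroids[c][1]) ** 2)

def accuracy(split):
    return sum(predict(x) == y for x, y in split) / len(split)

history = []
for epoch in range(6):  # six epochs, mirroring the experimental design
    random.shuffle(train_set)
    for x, y in train_set:  # supervised feedback during the training phase
        n = counts.get(y, 0)
        mx, my = centroids.get(y, (0.0, 0.0))
        centroids[y] = ((mx * n + x[0]) / (n + 1), (my * n + x[1]) / (n + 1))
        counts[y] = n + 1
    # Record the learning dynamics: train vs. test accuracy per epoch.
    history.append((accuracy(train_set), accuracy(test_set)))
```

Plotting `history` for humans and models side by side is what reveals the contrast the abstract reports: whether test accuracy rises in step with train accuracy (immediate generalization) or lags behind it.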
