Journal of Vision
September 2024, Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract
3D shape recognition in humans and deep neural networks
Author Affiliations & Notes
  • Shuhao Fu
    University of California, Los Angeles
  • Daniel Tjan
    Cerritos High School
  • Philip Kellman
    University of California, Los Angeles
  • Hongjing Lu
    University of California, Los Angeles
  • Footnotes
Acknowledgements: We gratefully acknowledge the generous funding provided by NSF BCS-2142269.
Journal of Vision September 2024, Vol.24, 735. doi:https://doi.org/10.1167/jov.24.10.735
      Shuhao Fu, Daniel Tjan, Philip Kellman, Hongjing Lu; 3D shape recognition in humans and deep neural networks. Journal of Vision 2024;24(10):735. https://doi.org/10.1167/jov.24.10.735.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Both humans and deep neural networks can recognize objects from 3D shapes depicted with sparse visual information, such as a set of points randomly sampled from the surfaces of 3D objects (termed a point cloud). Although networks achieve human-like performance in recognizing objects from 3D shapes, it is unclear whether network models acquire 3D shape representations similar to those used by human vision for object recognition. We hypothesize that training enables neural networks to access local 3D shape features and distinctive parts associated with objects, which suffice for good object recognition performance, but that the networks lack representations of objects' global 3D shapes. We conducted two experiments to test this hypothesis. In Experiment 1, we created Lego-style point clouds that mimic object shapes constructed from Lego bricks. Lego-style 3D objects disrupt local shape features while preserving the global 3D shape of objects. Point clouds of Lego-style objects were shown to both human participants and a dynamic graph convolutional neural network (DGCNN) trained to recognize 3D objects from point-cloud displays. Humans maintained high recognition performance when the disruption of local shape was moderate (i.e., when the Lego pieces were small): 90% for intact 3D shapes vs. 89% for Lego shapes. In contrast, DGCNN performance dropped significantly, from 90% to 54%. In Experiment 2, we spatially scrambled object parts to disrupt the global 3D shape and found the opposite result: human recognition performance for part-scrambled displays worsened significantly, but the neural network recognized part-scrambled objects about as well as intact ones. Together, the two experiments yield a double dissociation showing that human object recognition relies on global 3D shape, whereas neural networks learn to recognize 3D objects from local shape features.
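The two stimulus manipulations can be illustrated with a minimal NumPy sketch. This is not the authors' stimulus-generation code; the function names, the voxel-snapping scheme for the Lego-style displays, and the axis-slab partition used for part scrambling are all illustrative assumptions. The first function disrupts local surface detail while preserving global shape; the second does the reverse.

```python
import numpy as np

def legoize(points, brick_size):
    """Snap each point to the center of the axis-aligned 'brick'
    (voxel) containing it, coarsening local surface detail while
    leaving the global 3D shape intact. Illustrative sketch only;
    the study's actual Lego-style construction may differ."""
    points = np.asarray(points, dtype=float)
    return (np.floor(points / brick_size) + 0.5) * brick_size

def scramble_parts(points, n_parts=4, rng=None):
    """Partition a point cloud into equal-size slabs along the z-axis
    and randomly swap the slabs' centroid positions, disrupting the
    global 3D shape while leaving local structure within each part
    intact. Illustrative sketch only."""
    rng = np.random.default_rng(rng)
    points = np.asarray(points, dtype=float)
    order = np.argsort(points[:, 2])                 # slice along z
    parts = np.array_split(points[order], n_parts)
    centers = np.array([p.mean(axis=0) for p in parts])
    perm = rng.permutation(n_parts)
    # Translate each part so its centroid lands at another part's centroid.
    return np.vstack([p - c + centers[q]
                      for p, c, q in zip(parts, centers, perm)])

# Nearby surface points collapse onto the same brick center:
pts = np.array([[0.10, 0.20, 0.30],
                [0.12, 0.18, 0.33],
                [0.90, 0.95, 0.05]])
print(legoize(pts, brick_size=0.25))
```

With a brick size of 0.25, the first two points fall in the same voxel and map to the identical brick center, mimicking the loss of fine surface detail in the Lego-style displays.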
