Vision Sciences Society Annual Meeting Abstract | September 2015
Learning invariant object representations: asymmetric transfer of learning across line drawings and 3D cues
Author Affiliations
  • Moqian Tian
    Psychology Department, Stanford University, Stanford, CA.
  • Dan Yamins
    Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, MIT, Cambridge, MA.
  • Kalanit Grill-Spector
Psychology Department, Stanford University, Stanford, CA; Stanford Neuroscience Institute, Stanford, CA.
Journal of Vision September 2015, Vol. 15, 1088. doi: https://doi.org/10.1167/15.12.1088
Abstract

View-invariant object recognition requires binding multiple views of a single object while discriminating different objects seen from similar views. The capacity for invariant recognition likely arises in part through unsupervised learning, but it is unclear what visual information is used during learning, or what internal object representations are generated. In study 1, subjects were tested on their ability to recognize novel 3-dimensional objects rotated in depth, before and after unsupervised learning in which they saw the objects from a variety of angles. Objects were rendered in four visual formats: stereo, shape-from-shading, line drawings, and silhouettes; subjects were trained and tested on the same format. Unsupervised learning produced significant recognition improvement in all conditions, but was substantially less effective for silhouettes than for the other three formats (p < 0.01). A computational model of the ventral stream (Yamins, 2014) showed equal improvement across all formats at an intermediate, V4-like stage, indicating that the weaker human learning for silhouettes cannot be attributed to a lack of visual information. However, a higher, IT-like stage of the same model exhibited a learning pattern similar to that in humans. In study 2 we tested whether learning transfers across formats. Subjects underwent unsupervised learning of objects generated from shape-from-shading or line drawings and were tested on objects generated from the same or a different cue. While subjects' performance improved significantly after learning in all conditions, test performance was better for shape-from-shading than for line drawings irrespective of the learning cue (p < 0.02). We replicated these findings in a separate study training and testing with stereoscopic renderings or line drawings of the objects. Together, these findings indicate that although contours can support learning of invariant representations, structural cues are more effective. Furthermore, our results suggest that learning optimizes internal representations to improve recognition of real 3D objects rather than simply generating associations among trained views.
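
To make the model comparison concrete, below is a minimal Python sketch (not the authors' code) of a layer-wise linear-readout analysis of the kind the abstract describes: features from an intermediate (V4-like) stage and a higher (IT-like) stage are each assessed by how well a linear classifier generalizes object identity across viewpoints, separately for each rendering format. The feature arrays here are random placeholders; in the actual analysis they would be activations from the Yamins (2014) ventral-stream model, and the function name and layer labels are ours for illustration only.

```python
# Sketch of a cross-view linear readout, per model stage and rendering format.
# Placeholder random features stand in for model activations (Yamins, 2014).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_objects, n_views, n_features = 8, 24, 256
formats = ["stereo", "shape-from-shading", "line drawing", "silhouette"]

def cross_view_accuracy(features, labels, views):
    """Fit a linear classifier on half the views of every object,
    then score identity decoding on the held-out views."""
    train = views < n_views // 2
    clf = LinearSVC(C=1.0, max_iter=5000)
    clf.fit(features[train], labels[train])
    return clf.score(features[~train], labels[~train])

labels = np.repeat(np.arange(n_objects), n_views)   # object identity per image
views = np.tile(np.arange(n_views), n_objects)      # viewpoint index per image

for layer in ["V4-like", "IT-like"]:
    for fmt in formats:
        # Placeholder: substitute real model activations for this layer/format.
        feats = rng.standard_normal((n_objects * n_views, n_features))
        acc = cross_view_accuracy(feats, labels, views)
        print(f"{layer:8s} {fmt:18s} cross-view accuracy: {acc:.2f}")
```

A linear readout of this kind is a common proxy for how explicitly a representation supports a task. Under this scheme, the pattern reported in the abstract would appear as comparable cross-view accuracy across all four formats at the V4-like stage, but markedly lower accuracy for silhouettes at the IT-like stage, mirroring the human learning results.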

Meeting abstract presented at VSS 2015
