Abstract
How brains achieve invariant object learning and recognition has been a long-standing puzzle in both the biological and computational research communities. Invariant object recognition is computationally challenging, since any individual object can produce a huge number of different views due to variations in object position, scale, pose, and illumination. How does the brain solve this problem so effortlessly? Li & DiCarlo (2008, Science) have shown how unsupervised natural experience rapidly alters invariant object representations in the inferior temporal cortex. They did this by exploiting the fact that, during natural visual experience, objects tend to remain present while object or viewer motion changes their retinal image. In their study, two objects consistently swapped identity across temporally continuous changes in retinal position. A neural model is proposed that quantitatively simulates the Li & DiCarlo data as an expression of how spatial and object attention interact with invariant category learning processes during eye-movement search of a scene. This model builds on the recent ARTSCAN model of this process (Fazl, Grossberg, & Mingolla, 2008, Cognitive Psychology), which also simulated reaction-time data showing an object advantage during spatial attention shifts (Egly, Driver, & Rafal, 1994, JEP: General; Brown & Denney, 2007, Perception & Psychophysics). The current work clarifies and refines the predicted role of form-fitting spatial attentional shrouds (Tyler & Kontsevich, 1995, Perception) and related mechanisms that regulate persistence of object representations across eye movements during view-invariant object learning.
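To illustrate the temporal-continuity principle behind the swap-exposure manipulation, the following minimal sketch uses a simple Földiák-style trace rule as a stand-in for invariant category learning. It is not the ARTSCAN model or the Li & DiCarlo analysis; the two-object, two-position setup, the learning rates, and all variable names are illustrative assumptions chosen only to show how pairing object A at a peripheral position with object B at the fovea can mis-bind the two identities at the swap position.

```python
import numpy as np

# Illustrative trace-rule sketch (Foldiak-style temporal continuity learning),
# NOT the ARTSCAN model itself. Inputs are one-hot codes for (object, position):
# index = object * N_POS + position.
N_OBJ, N_POS = 2, 2          # objects A/B; positions: peripheral (0) and foveal (1)
N_IN = N_OBJ * N_POS

def one_hot(obj, pos):
    x = np.zeros(N_IN)
    x[obj * N_POS + pos] = 1.0
    return x

# One "invariant category" unit per object; weights start weakly object-selective
# across both positions (pre-exposure position invariance).
W0 = 0.1 * np.ones((N_OBJ, N_IN))
for obj in range(N_OBJ):
    for pos in range(N_POS):
        W0[obj, obj * N_POS + pos] = 0.5

def expose(W, sequence, eta=0.5, alpha=0.2, n_reps=50):
    """Hebbian learning gated by a slowly decaying activity trace."""
    W = W.copy()
    for _ in range(n_reps):
        trace = np.zeros(N_OBJ)
        for (obj, pos) in sequence:
            x = one_hot(obj, pos)
            y = W @ x                        # category-unit responses
            trace = (1 - eta) * trace + eta * y
            W += alpha * np.outer(trace, x)  # temporal-continuity binding
            W = np.clip(W, 0.0, 1.0)
    return W

# Normal exposure: object A remains A across the simulated saccade (peripheral -> foveal).
W_normal = expose(W0, [(0, 0), (0, 1)])
# Swap exposure: A at the peripheral position is replaced by B at the fovea.
W_swap = expose(W0, [(0, 0), (1, 1)])

# Selectivity of the "A" category for A vs. B at the foveal (swap) position.
for name, Wx in [("normal", W_normal), ("swap", W_swap)]:
    resp_A = Wx[0] @ one_hot(0, 1)
    resp_B = Wx[0] @ one_hot(1, 1)
    print(f"{name:6s} exposure: A-category response to A={resp_A:.2f}, to B={resp_B:.2f}")
```

Under these assumptions, the swap condition raises the A-category's response to object B at the foveal position, a toy analogue of the position-specific identity confusion that the full model must explain while also accounting for how attentional shrouds gate such learning across eye movements.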
Supported in part by the National Science Foundation (SBE-0354378).