Abstract
We see the world in scenes. In spite of the infinitely diverse appearance of these scenes, they typically include contextual associations that make the identity of the objects therein highly predictable. Such associations can give rise to context-based expectations that might benefit recognition of objects within the same setting. For example, seeing a fork will facilitate the recognition of contextually related objects such as a knife and a plate. Building on previous work (Bar, 2003; Bar & Aminoff, 2003), we propose a mechanism for rapid top-down and contextual contributions to object recognition: A blurred, low spatial frequency representation of the input (e.g., a beach scene) is projected early and rapidly from the visual cortex to the prefrontal cortex (PFC) and the parahippocampal cortex (PHC). In the PHC, each image activates an experience-based "guess" about the present context (i.e., a context frame). This information is then projected to the inferior temporal (IT) cortex, where it triggers the activation of the set of object representations associated with that specific context (e.g., a towel, a beach chair, a beach umbrella, a sand castle). In parallel, the same blurred image activates information in the PFC that subsequently sensitizes the most likely candidate interpretations of the target object in IT (e.g., a mushroom, an umbrella, a beach umbrella, a desk lamp, a tree). The intersection, in IT, between the representations of the objects associated with the particular context and the candidate interpretations of the target object results, in typical situations, in a reliable selection of a single identity (e.g., a beach umbrella). This representation is then refined and further instantiated with the gradual arrival of high spatial frequencies. We will outline the logic and discuss behavioral and neuroimaging data that support various aspects of the proposed model.
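The core selection step of the proposed model can be thought of as an intersection between two sets activated in IT: objects associated with the guessed context (via the PHC pathway) and candidate interpretations of the blurred target (via the PFC pathway). The sketch below is a toy illustration of that intersection only, not the authors' implementation; all data structures and names (CONTEXT_FRAMES, LSF_CANDIDATES, recognize) are hypothetical placeholders.

```python
# Toy illustration (not the authors' implementation) of the model's
# intersection step: context-associated objects (PHC pathway) are
# intersected with shape-based candidate interpretations (PFC pathway)
# to converge on a single identity in IT.

# Hypothetical context frames: objects associated with each scene context.
CONTEXT_FRAMES = {
    "beach": {"towel", "beach chair", "beach umbrella", "sand castle"},
    "office": {"desk lamp", "keyboard", "stapler", "monitor"},
}

# Hypothetical candidate interpretations of a blurred (low spatial frequency) target:
# objects whose coarse appearance resembles the input.
LSF_CANDIDATES = {
    "umbrella-like blob": {"mushroom", "umbrella", "beach umbrella", "desk lamp", "tree"},
}

def recognize(context_guess: str, target_shape: str) -> set:
    """Intersect context-associated objects with candidate interpretations."""
    associated = CONTEXT_FRAMES.get(context_guess, set())
    candidates = LSF_CANDIDATES.get(target_shape, set())
    return associated & candidates

if __name__ == "__main__":
    # A beach-scene gist plus an umbrella-like blur converges on one identity.
    print(recognize("beach", "umbrella-like blob"))  # {'beach umbrella'}
```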
Supported by R01NS44319, R01NS050615, and the James S. McDonnell Foundation #21002039.