Abstract
Specific regions of ventral temporal cortex (VTC) appear to be specialized for representing certain visual categories: for example, the visual word form area (VWFA) for words and the fusiform face area (FFA) for faces. However, a computational understanding of how these regions process visual inputs is lacking. Here we develop a fully computable model that addresses both bottom-up and top-down effects and quantitatively predicts responses in VWFA and FFA (Kay & Yeatman, eLife, 2017). The model is based on measurements of BOLD responses to a wide range of carefully controlled images, obtained while subjects performed different tasks on those images. The model shows how a bottom-up stimulus representation is computed, how this representation is modulated by top-down interactions with the intraparietal sulcus (IPS), and how IPS activity relates to the behavioral goals of the subject. We also briefly discuss the broader endeavor of modeling neural information processing and propose principles for evaluating models (Kay, NeuroImage, 2017).
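
To make the described architecture concrete, the following is a minimal sketch of a model in which a bottom-up stimulus representation is scaled multiplicatively by a task-dependent, IPS-mediated gain. This is an illustrative simplification, not the published model: the contrast-energy computation, the gain parameterization, and all function names (`bottom_up_response`, `ips_gain`, `predicted_bold`, `stimulus_relevance`) are hypothetical assumptions introduced here.

```python
import numpy as np

def bottom_up_response(image, rf_weights):
    """Bottom-up drive: receptive-field-weighted sum of local contrast
    energy (a hypothetical simplification of the stimulus representation)."""
    contrast_energy = (image - image.mean()) ** 2
    return np.sum(rf_weights * contrast_energy)

def ips_gain(stimulus_relevance, baseline=1.0, scale=2.0):
    """Top-down gain from IPS: larger when the stimulus is relevant to the
    subject's current task (hypothetical parameterization)."""
    return baseline + scale * stimulus_relevance

def predicted_bold(image, rf_weights, stimulus_relevance):
    """Predicted VTC (e.g., VWFA or FFA) response: bottom-up drive scaled
    multiplicatively by the task-dependent IPS gain."""
    return bottom_up_response(image, rf_weights) * ips_gain(stimulus_relevance)

# Example: the same image under two tasks with different behavioral goals
rng = np.random.default_rng(0)
image = rng.random((64, 64))           # stimulus (e.g., a word or face image)
rf = np.ones((64, 64)) / (64 * 64)     # uniform receptive-field weights

fixation = predicted_bold(image, rf, stimulus_relevance=0.0)  # stimulus ignored
category = predicted_bold(image, rf, stimulus_relevance=1.0)  # stimulus judged
print(f"fixation task: {fixation:.4f}, categorization task: {category:.4f}")
```

Under these assumptions, the identical stimulus yields a larger predicted response when it is behaviorally relevant, illustrating how top-down IPS interactions could modulate an otherwise fixed bottom-up representation.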