Abstract
In visual crowding, the presence of neighboring elements impedes the perception of a target. Crowding is traditionally explained with feedforward, local models. However, increasing the number of neighboring elements can decrease crowding, i.e., lead to uncrowding, which demonstrates the inadequacy of the classic feedforward explanation. Global models are needed, but behavioral experiments alone cannot discriminate between them. Here, we used fMRI to study the effects of (un)crowding on the BOLD response and on effective connectivity between visual regions V1 to V4 and the lateral occipital complex (LOC). We tested three experimental conditions: crowding, uncrowding, and no crowding. First, following the standard approach of fMRI crowding studies, we extracted the percent BOLD signal change (PSC) for each condition in each area. We replicated previous results of BOLD attenuation in crowding, beginning in V2 and persisting up the visual hierarchy. However, uncrowding further attenuated the BOLD response, which suggests that PSC is not (monotonically) related to the level of crowding, as commonly assumed. We then used dynamic causal modeling (DCM) and Bayesian model comparison. Specifically, we contrasted top-down, bottom-up, and recurrent models. Recurrent models fit the data best in all three experimental conditions, even in the simplest no-crowding condition. Our results explain the discrepancies between previous fMRI investigations of crowding: in a recurrent visual hierarchy, the crowding effect can theoretically be detected at any stage. Beyond crowding, we demonstrate the need for data-driven models like DCM to understand the complex recurrent processing that presumably underlies perception in general. The DCM framework allows us not only to compare model architectures but also to estimate the computational details of the model in the form of the connection strengths between regions, which can then be used to inform theoretical models.
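The two quantities named above, PSC and Bayesian model comparison, can be illustrated with a minimal Python sketch. It is not the analysis pipeline of this study. The function names, the baseline-relative PSC definition, and all numbers are illustrative assumptions. The model comparison shown is a simple fixed-effects version with a uniform prior over models, whereas DCM in practice approximates each model's log evidence with a variational free energy and group studies often use random-effects comparison.

```python
import math


def percent_signal_change(condition_mean, baseline_mean):
    """PSC relative to a baseline: 100 * (condition - baseline) / baseline.

    Assumed definition for illustration; studies differ in how the
    baseline is chosen.
    """
    if baseline_mean == 0:
        raise ValueError("baseline_mean must be non-zero")
    return 100.0 * (condition_mean - baseline_mean) / baseline_mean


def posterior_model_probabilities(log_evidences):
    """Posterior probability of each model under a uniform model prior.

    Equivalent to a softmax over log evidences; the maximum is subtracted
    before exponentiating for numerical stability.
    """
    max_le = max(log_evidences.values())
    weights = {name: math.exp(le - max_le) for name, le in log_evidences.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical numbers, not data from the study.
psc = percent_signal_change(condition_mean=101.5, baseline_mean=100.0)

log_evidences = {"bottom_up": -1210.0, "top_down": -1215.0, "recurrent": -1204.0}
posteriors = posterior_model_probabilities(log_evidences)
best_model = max(posteriors, key=posteriors.get)

print(f"PSC: {psc:.2f}%")
for name, p in posteriors.items():
    print(f"{name}: {p:.4f}")
print("Best model:", best_model)
```

With these made-up log evidences, the recurrent model is favored because its log evidence exceeds the others by several units; differences in log evidence, not their absolute values, drive the comparison.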