Abstract
Neurophysiology experiments and computational models provide a compelling explanation for the receptive field structures of Primary Visual Cortex (V1). However, a comprehensive explanation of the receptive fields of the Secondary Visual Cortex (V2) remains far from complete. Recent hierarchical models of V2 provide an appealing avenue for progress. By fixing the computations of lower, better-understood visual areas in hierarchical models, as demonstrated for instance by Hosoya and Hyvärinen (2015), the computations and receptive field structures of V2 models can be learned from an ensemble of images and studied independently. In this work, we compared the receptive field representations obtained from two candidate V2 models. First, we implemented a modified version of the model of Hosoya and Hyvärinen (2015) (model M1), in which the first-layer model complex cells undergo a significant dimensionality reduction with PCA, followed by expansive sparse coding (in place of independent component analysis). Second, we modified this pipeline by first incorporating learning of a ubiquitous cortical nonlinearity, namely divisive normalization by the contextual surround, followed by pooling, PCA (Coen-Cagli and Schwartz, 2013), and expansive sparse coding (model M2). Both models were trained on 4 million randomly sampled 48×48 image patches from the ImageNet database distributed for ILSVRC12 (Russakovsky et al., 2015). The learned receptive field structures of M1 included some of the patterns found in Hosoya and Hyvärinen (2015) (e.g. iso-oriented excitation with broad inhibition units) along with circular structure. Incorporating the contextual divisive normalization further resulted in corner- and curvature-selective units. We quantified the similarity of the representations by finding, for each unit in one model, the most highly correlated unit in the other (34.6% of M1 units had matching M2 units with correlation above 0.3).
We also qualitatively compared the models using t-SNE visualization.
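The unit-matching measure described above can be illustrated with a minimal sketch. It assumes that correlation is computed between unit responses to a shared set of inputs, and that the signed Pearson correlation is used; the abstract does not specify either choice. The function names and array layout are hypothetical and are not taken from the original work.

```python
import numpy as np


def best_match_correlations(resp_a, resp_b):
    """For each unit in model A, find the most highly correlated unit in model B.

    Assumed layout (hypothetical): resp_a has shape (n_samples, n_units_a) and
    resp_b has shape (n_samples, n_units_b), with rows being the same inputs.
    Units with zero variance are not handled and would produce NaNs.
    """
    # Center each unit's responses and scale to unit norm, so that the
    # matrix product below yields Pearson correlation coefficients.
    a = resp_a - resp_a.mean(axis=0)
    b = resp_b - resp_b.mean(axis=0)
    a = a / np.linalg.norm(a, axis=0)
    b = b / np.linalg.norm(b, axis=0)

    corr = a.T @ b  # shape (n_units_a, n_units_b)

    # Signed correlation is an assumption; use np.abs(corr) to ignore sign.
    best_idx = corr.argmax(axis=1)
    best_corr = corr[np.arange(corr.shape[0]), best_idx]
    return best_idx, best_corr


def fraction_matched(best_corr, threshold=0.3):
    """Fraction of units whose best match exceeds the correlation threshold."""
    return float(np.mean(np.asarray(best_corr) > threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m2 = rng.standard_normal((500, 8))    # stand-in for M2 unit responses
    m1 = rng.standard_normal((500, 6))    # stand-in for M1 unit responses
    m1[:, 0] = 2.0 * m2[:, 3] + 1.0       # make one M1 unit match an M2 unit
    idx, r = best_match_correlations(m1, m2)
    print(idx, np.round(r, 2), fraction_matched(r))
```

With real data, the two response matrices would come from the two models evaluated on the same held-out image patches, and the 0.3 threshold corresponds to the one quoted above.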
Acknowledgement: NSF Award Number: 1715475, NSF GRFP. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1451511. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.