Abstract
Convolutional neural networks (CNNs) are currently the best models at hand to mimic the object recognition capabilities of the human visual system. However, these models lack biological plausibility in terms of learning and network architecture. With the goal of providing a realistic yet powerful model for studying the function of the human visual system, we propose a biologically plausible implementation of a deep neural network. We combined excitatory and inhibitory rate-coded neurons in a recurrent network of V1 (L4, L2/3) and V2 (L4, L2/3), with Hebbian synaptic plasticity, intrinsic plasticity, and structural plasticity. The connectivity between layers is modeled on anatomical data of the neocortical circuit (Douglas & Martin, 2004; Potjans & Diesmann, 2014). The network learns invariance and feature selectivity in parallel from natural scenes. We demonstrate the functioning of the model with three different kinds of evaluations: (I) Its object recognition performance on the COIL-100 dataset. We obtained good accuracies (99.18±0.08%) using an SVM with a linear kernel on top. The network shows increasing recognition accuracies in deeper layers, matching the hypothesis that the neural code becomes progressively untangled in terms of linear pattern separability (DiCarlo & Cox, 2007). (II) We show that the learned receptive fields fit the physiological data of V1 (Ringach, 2002). (III) The network matches the recent hypothesis that V2 is sensitive to higher-order statistical dependencies of naturalistic visual stimuli (Freeman et al., 2013). We measured the neuronal responses to synthetic naturalistic textures in comparison to spectrally-matched noise and found results similar to the neurophysiological data: V2-L2/3 neurons prefer naturalistic textures over spectrally-matched noise, whereas V1 shows no preference.
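The linear-readout evaluation (I) can be illustrated with a minimal sketch. This is a hypothetical stand-in, not the authors' actual pipeline: random class-clustered feature vectors play the role of the network's layer responses, and a linear-kernel SVM is trained on top, as in the COIL-100 evaluation described above.

```python
# Hypothetical sketch of a linear-readout evaluation (assumption: synthetic
# features stand in for the model's layer activations).
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in "layer responses": 10 object classes, 50 samples each,
# 128-dimensional features clustered around a per-class center.
n_classes, n_per_class, dim = 10, 50, 128
centers = rng.normal(size=(n_classes, dim))
X = np.vstack([c + 0.3 * rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Linear-kernel SVM readout; test accuracy measures linear separability
# of the representation, in the sense of DiCarlo & Cox (2007).
clf = LinearSVC(C=1.0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"linear-readout accuracy: {acc:.3f}")
```

Repeating this readout per layer, as in the abstract, would show whether deeper representations are more linearly separable.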
We therefore suggest that the functioning and design of the presented model make it an appropriate platform for studying the visual cortex.
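The Hebbian synaptic plasticity for rate-coded units mentioned above can be sketched with a minimal, hypothetical rule. The abstract does not specify the model's actual learning rules; Oja's rule, a normalized Hebbian variant, is used here only to illustrate stable Hebbian learning in a single linear rate unit.

```python
# Minimal Hebbian learning sketch for one rate-coded unit (Oja's rule,
# an assumption; the model's actual plasticity rules are not given here).
import numpy as np

rng = np.random.default_rng(1)

def oja_step(w, x, lr=0.01):
    y = w @ x                        # postsynaptic rate of a linear unit
    return w + lr * y * (x - y * w)  # Hebbian term y*x with a decay y^2*w
                                     # that keeps the weight norm bounded

dim = 16
w = rng.normal(scale=0.1, size=dim)

# Inputs with one dominant direction u; Oja's rule drives w toward the
# principal component of the input correlations, with unit norm.
u = np.ones(dim) / np.sqrt(dim)
for _ in range(5000):
    x = (2.0 * rng.normal()) * u + 0.1 * rng.normal(size=dim)
    w = oja_step(w, x)

print(f"|w| = {np.linalg.norm(w):.3f}")
```

The decay term is one simple way to obtain the stable, competitive weight dynamics that purely Hebbian growth lacks.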
Meeting abstract presented at VSS 2016