Abstract
A distinguishing feature of neurons in cortical area V2 is selectivity for higher-order visual features beyond the localized orientation energy conveyed by area V1. Recent physiology has shown that while single units in V1 respond primarily to the spectral content of a stimulus, single units in V2 are selective for the image statistics that distinguish natural images. Despite these observations, a description of how V2 achieves higher-order feature selectivity from V1 outputs remains elusive. To study this, we consider a two-layer linear-nonlinear network mimicking areas V1 and V2. When the network is optimized to detect a subset of higher-order features, the fitted model V2-like units perform computations that resemble localized differences over the space of V1 afferents, computing relative spectral energy within and across the V1 tuning dimensions of space, orientation, and scale. Interestingly, these model fits bear strong qualitative resemblance to models trained on data recorded from single units in primate V2, suggesting that some V2 neurons are ideal for encoding higher-order features of natural images. Cortical neurons, such as those of V1, are known to exhibit sparse (heavy-tailed) response distributions to natural images, a property believed to reflect an efficient image code; these idealized V2-like units exhibit similar sparsity to that seen in model V1 populations. What we show here is that sparseness itself can encode image content: classifiers trained to detect higher-order image features from a population readout of response sparsity are significantly more efficient when using V2-like units than comparable V1-like populations, requiring fewer observations to achieve the same classification accuracy.
Thus, we show that differences over V1 afferent activity yield efficient mechanisms for computing higher-order visual features, providing a justification for receptive field structures observed in neurons within primate area V2.
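The core computation described above can be illustrated with a minimal sketch. The following is not the paper's fitted model but a hypothetical toy instance of the two-layer linear-nonlinear cascade, with assumed filter shapes and weights: a V1-like stage (linear filtering, halfwave rectification, squaring) feeds a V2-like unit whose weights implement a signed difference across the orientation dimension of its V1 afferents.

```python
import numpy as np

def v1_layer(image, filters):
    """V1-like stage: linear filtering, halfwave rectification, squaring."""
    drive = filters @ image.ravel()
    return np.maximum(drive, 0.0) ** 2

def v2_unit(v1_resp, w):
    """V2-like stage: a signed (difference) combination over V1 afferents,
    followed by halfwave rectification."""
    return max(float(w @ v1_resp), 0.0)

# Hypothetical 8x8 Gabor-like filters at two orthogonal orientations
n = 8
xx, yy = np.meshgrid(np.arange(n), np.arange(n))
env = np.exp(-((xx - n / 2) ** 2 + (yy - n / 2) ** 2) / 8)
gabor_a = np.sin(np.pi * xx / 2) * env  # carrier varies along x
gabor_b = np.sin(np.pi * yy / 2) * env  # carrier varies along y
filters = np.stack([gabor_a.ravel(), gabor_b.ravel()])

# Assumed V2-like weights: a localized difference across orientation,
# i.e. relative spectral energy between the two V1 tuning channels
w = np.array([1.0, -1.0])

# The unit fires when energy at orientation "a" exceeds orientation "b",
# and is silent for the opposite imbalance (negative drive is rectified)
r_a = v2_unit(v1_layer(gabor_a, filters), w)
r_b = v2_unit(v1_layer(gabor_b, filters), w)
print(r_a > 0.0, r_b == 0.0)
```

A classifier reading out such units, or the sparsity of their population responses, would then be trained to detect higher-order image features, as described in the abstract.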