Open Access
Article  |   August 2017
Selectivity, hyperselectivity, and the tuning of V1 neurons
Author Affiliations
  • Kedarnath P. Vilankar
    Department of Psychology, Cornell University, Ithaca, NY, USA
    [email protected]
  • David J. Field
    Department of Psychology, Cornell University, Ithaca, NY, USA
    [email protected]
Journal of Vision August 2017, Vol.17, 9. doi:https://doi.org/10.1167/17.9.9

      Kedarnath P. Vilankar, David J. Field; Selectivity, hyperselectivity, and the tuning of V1 neurons. Journal of Vision 2017;17(9):9. https://doi.org/10.1167/17.9.9.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

In this article, we explore two forms of selectivity in sensory neurons. The first we call classic selectivity, referring to the stimulus that optimally stimulates a neuron. If a neuron is linear, then this stimulus can be determined by measuring the response to an orthonormal basis set (the receptive field). The second type of selectivity we call hyperselectivity; it is either implicitly or explicitly a component of several models including sparse coding, gain control, and some linear-nonlinear models. Hyperselectivity is unrelated to the stimulus that maximizes the response. Rather, it refers to the drop-off in response around that optimal stimulus. We contrast various models that produce hyperselectivity by comparing the way each model curves the iso-response surfaces of each neuron. We demonstrate that traditional sparse coding produces such curvature and that the curvature increases with increasing degrees of overcompleteness. We demonstrate that this curvature produces a systematic misestimation of the optimal stimulus when the neuron's receptive field is measured with spots or gratings. We also show that this curvature allows for two apparently paradoxical results. First, it allows a neuron to be very narrowly tuned (hyperselective) to a broadband stimulus. Second, it allows neurons to break the Gabor–Heisenberg limit in their localization in space and frequency. Finally, although gain-control models, some linear-nonlinear models, and sparse coding have much in common, we argue that this approach to hyperselectivity provides a deeper understanding of why these nonlinearities are present in the early visual system.

Introduction
From the first recordings of neurons in the visual pathway (e.g., Barlow, 1953; Hubel & Wiesel, 1962; Kuffler, 1953), it has been recognized that such neurons are selective to particular features of the visual world. Over the last several decades, these neurons have been probed with a wide variety of stimuli, and their selectivity has served as the basis of a number of theories of sensory coding. Despite the wealth of studies, we believe that there remains confusion regarding how selective a neuron is and what a neuron is selective to. These issues of selectivity arise when a neuron's receptive field is used as a predictive model of the neuron's behavior. We are certainly not the first to note this. It is widely understood that if a neuron is nonlinear, then the receptive field is an incomplete model of its response. The list of nonlinearities found in V1 is quite large and includes effects like gain control, cross-orientation inhibition, end stopping, and a variety of interactions outside of the classical receptive field. Each of these nonlinearities is often given its own functional goal (e.g., gain control is for controlling gain, end stopping is for detecting the ends of lines or edges). 
Several efforts have been made to unify these different nonlinearities into a single framework. One approach argues that these nonlinearities serve to control the gain of a neuron (e.g., Pagan, Simoncelli, & Rust, 2016; Schwartz & Simoncelli, 2001; Tolhurst & Heeger, 1997). A second approach argues that nonlinearities provide an efficient sparse overcomplete code (e.g., Golden, Vilankar, Wu, & Field, 2016; Zhu & Rozell, 2013). Following from Zetzsche's work (e.g., Zetzsche, Krieger, & Wegmann, 1999; Zetzsche & Rohrbein, 2001), we have argued that these nonlinearities follow from a relatively simple curvature in the iso-response surfaces of these neurons (Golden et al., 2016). We argued that sparse coding produces this curvature to reduce the redundancy resulting from the nonorthogonal neurons that are produced by using overcomplete codes. 
We will begin this article by contrasting two forms of selectivity: classic selectivity and hyperselectivity. Classic selectivity is defined in terms of the optimal stimulus for a neuron, while hyperselectivity is defined in terms of the falloff in sensitivity as one moves away from that optimal stimulus. We argue that the curvature in the iso-response surfaces provides a measure of hyperselectivity, and that this approach allows us to make a distinction between the stimulus that optimally stimulates a neuron (what it is selective to) and how narrowly tuned that neuron is to that stimulus. We follow this introduction with the forms of curvature produced by various nonlinear models. Some of these models were introduced in our previous work (Golden et al., 2016). However, we extend this discussion to include cascaded linear-nonlinear models (e.g., Pagan et al., 2016; Schwartz, Pillow, Rust, & Simoncelli, 2006). We also focus on how neighboring neurons represent the subspace between any pair of neurons, showing that each model represents this subspace differently. 
In the last sections, we focus on several implications of this curvature. We show that the curvature allows the paradoxical result that a neuron can be narrowly tuned to a broadband stimulus or broadly tuned to a narrowband stimulus. We demonstrate that with this curvature, the receptive field changes with the basis used to measure the neuron. In particular, the spatial-frequency bandwidths are less than those predicted from the receptive field measured with spots or lines (in line with a number of studies). Finally, we show that this form of nonlinearity allows neurons to have joint frequency and space limits that are below the Gabor–Heisenberg limit. The Gabor function (the product of a sinusoid and a Gaussian) has proven to be a popular model of V1 neurons, and it has been argued that such functions are optimal in terms of their joint localization in space and frequency. As we discuss later, the nonlinearities represented by this curvature provide a means of breaking this localization limit. 
The receptive field and selectivity in visual neurons
Many of our concepts of selectivity in visual neurons trace back to the early recordings of visual neurons in retina (Barlow, 1953; Hartline, Wagner, & Ratliff, 1956) and V1 (e.g., Hubel & Wiesel, 1962). These studies describe the responses of visual neurons in terms of each neuron's receptive field. The mapping of the response to a spot or line as a function of position provides a first-order description of the neuron. For a linear neuron, this response is all that is needed to provide a complete description of a neuron's response to any stimulus. The receptive field also describes the stimulus that produces the optimal response, and provides a template that will predict the response to any novel stimulus. For a linear neuron, we can simply treat the neuron as a vector and the response R(S) will simply be the dot product of the receptive field \((\overrightarrow {rf} )\) and the visual stimulus S:  
\begin{equation}\tag{1}R(S) = \langle \overrightarrow {rf} , S \rangle \end{equation}
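As a minimal sketch of Equation 1, the snippet below treats a receptive field and a stimulus as flattened vectors and checks that the response is linear in the stimulus. The 8×8 patch size, the random receptive field, and the function name `linear_response` are illustrative assumptions, not values from the article.

```python
import numpy as np

def linear_response(rf, stimulus):
    """Equation 1: inner product of the receptive field and the stimulus."""
    return float(np.dot(rf.ravel(), stimulus.ravel()))

rng = np.random.default_rng(0)
rf = rng.standard_normal((8, 8))   # hypothetical 8x8 receptive field
s1 = rng.standard_normal((8, 8))   # two arbitrary stimuli
s2 = rng.standard_normal((8, 8))

# Linearity: R(a*s1 + b*s2) equals a*R(s1) + b*R(s2)
lhs = linear_response(rf, 2.0 * s1 - 3.0 * s2)
rhs = 2.0 * linear_response(rf, s1) - 3.0 * linear_response(rf, s2)
print(np.isclose(lhs, rhs))
```

Because the response is a dot product, the unit-energy stimulus that maximizes it is the receptive field itself scaled to unit norm, which is the sense in which the receptive field acts as a template.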
 
However, no biological neuron is completely linear. Whether we consider simple thresholds, saturating nonlinearities, gain control, or a variety of nonclassical responses, the response of a neuron to complex stimuli like natural scenes can rarely be determined solely from a two-dimensional receptive field (e.g., David & Gallant, 2005; Mante, Frazor, Bonin, Geisler, & Carandini, 2005; Murray, 2011; Olshausen & Field, 2005; Prenger, Wu, David, & Gallant, 2004). A wide variety of attempts have been made to model these nonlinearities, with various degrees of success (e.g., Mély & Serre, 2016; Prenger et al., 2004; Schwartz & Simoncelli, 2001; Tolhurst & Heeger, 1997). In this article, we are not attempting to provide a model that produces a better fit to all the experiments in the literature (see Mély & Serre, 2016). Rather, we have two goals here. First, along the lines of Zetzsche (e.g., Zetzsche et al., 1999; Zetzsche & Nuding, 2005) and Golden et al. (2016), we wish to emphasize that many of these nonlinearities can be described in terms of simple curvature of the response surfaces. Second, we argue that this curvature is responsible for a number of interesting behaviors that we believe provide deeper insights into the function of these nonlinearities. 
A number of studies have focused on the geometry of sensory responses (Field, 1994; Sharpee, 2013; Tsai & Cox, 2015; Yamins et al., 2014; Zetzsche et al., 1999; Zetzsche & Nuding, 2005). The important work using spike-triggered covariance (e.g., Rust, Schwartz, Movshon, & Simoncelli, 2004, 2005; Schwartz et al., 2006; Schwartz, Chichilnisky, & Simoncelli, 2002; Vintch, Movshon, & Simoncelli, 2015) has also provided a means of characterizing the local geometry implied by these nonlinearities. We will return to these results in later sections. Our recent article (Golden et al., 2016) focused on selectivity in sparse coding networks. We argued that overcomplete sparse coding networks produce gain-control-like behavior, end stopping, and other nonlinearities because these nonlinearities are required to produce an efficient overcomplete code. We found that overcomplete sparse coding warps the iso-response surfaces to help isolate the causes and produce a more efficient representation. In the next section, we compare the two forms of selectivity found in sensory neurons and show how some well-documented results are explained by the curvature in the iso-response surfaces. 
Two forms of selectivity
In this section we wish to consider two forms of selectivity. Classic selectivity describes the stimulus that produces the maximum response for a given stimulus magnitude (Smax). If a neuron is linear, then classic selectivity can be determined by measuring the response to a basis set (e.g., spots or gratings). This receptive field describes how the neuron responds as a function of position (and time, color, etc.). Our second form of selectivity, hyperselectivity (Golden et al., 2016), is a measure of how narrowly tuned the neuron is to that optimal stimulus. 
Our goal here is to make a clear distinction between the pattern that optimally stimulates a neuron and the selectivity around that optimal stimulus. A neuron may be optimally stimulated by an image of your grandmother, but if that neuron is linear, it is no more selective than any other linear neuron. However, if the neuron is hyperselective, we argue that the falloff as one moves away from your grandmother is faster than would be expected for a linear neuron. 
State space
One way to describe the differences between linear and nonlinear responses is to consider the response geometry within the image state space. Describing a neuron's behavior in this way can be illuminating but also conceptually difficult. For example, a 100-pixel image is represented by a 100-dimensional state space, where each axis represents the intensity of one pixel. We can also describe a neuron's response within this state space. For example, we might consider the family of stimuli which would produce a particular output in a neuron (e.g., 1 spike/s). If that family forms a surface in the state space, we will call this an iso-response surface. Such a surface in a 100-dimensional space could be quite complex. However, we believe that by considering the behavior of these surfaces in low dimensions, this approach can provide considerable insight into the behavior in higher dimensions. In the following sections we will be considering how neighboring neurons interact and how this affects the iso-response surfaces. 
Angle between two neurons
In this article, we will be considering how a population of linear or nonlinear neurons represents the population of images. We will be considering representations where the neurons are not necessarily orthogonal. A number of our figures will describe the responses of two neurons in a two-dimensional subspace defined by these two neurons. In these figures, we will often be referring to the angle between two neurons. We wish to emphasize that the angle in state space between two neurons θsp (see Equation 2) is not equivalent to the orientation difference between the neurons. Figure 1 shows examples of the angle for four pairs of neurons. The state-space angle represents the overlap between two neurons and is a function of the dot product of the receptive fields \((r\vec f1\,{\rm{and}}\,r\vec f2)\). The figure also shows the iso-response contours if the two neurons are linear, which we describe later. Although this two-dimensional figure represents only a slice of the high-dimensional state space, such figures have been shown to produce important insights into the nonlinear behavior of sensory neurons (Golden et al., 2016; Zetzsche et al., 1999; Zetzsche & Nuding, 2005). 
Figure 1
 
In this article we will often refer to the angle between two neurons. We are referring to the angle in image state space. The figure shows four pairs of neurons with their spatial-orientation selectivity difference and the angle between them in 2-D subspace defined by the two neurons in image state space. The angle represents the overlap between the two neurons' receptive fields \((r\vec f1\,{\rm{and}}\,r\vec f2)\), calculated as shown in Equation 2. (a) An example of two neurons with receptive fields at different positions and not overlapping. Such neurons are orthogonal, as represented by two vectors on the right. (b) The receptive fields are overlapping, which produces an angle that is less than 90°. (c–d) Examples of pairs of neurons with overlapping receptive fields but different orientation selectivity. Although the orientation difference is the same in the two cases, the angles between the two neurons in image state space are different. Despite the overlap, the pair of neurons in (d) are nearly orthogonal, as represented on the right.
In these kinds of plots, there can be confusion regarding what the axes represent. They can represent any two orthogonal stimuli (e.g., pixels, Fourier components, wavelet components). They can also represent the preferred stimuli for a set of neurons. In general, we will be considering how neighboring neurons interact and how the interaction depends on the angle between these neighboring neurons. Figure 1 shows examples where the neurons are linear and the neurons do not interact. As we will see, the nonlinear interactions between neighbors fundamentally alter the responses described in this space.  
\begin{equation}\tag{2}{\theta _{sp}} = \arccos \left({{ \langle r\vec f1 , r\vec f2 \rangle } \over {\lVert r\vec f1 \rVert \, \lVert r\vec f2 \rVert}}\right)\end{equation}
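To make Equation 2 concrete, the following sketch computes the state-space angle for pairs of Gabor-like receptive fields. The `gabor` helper, the 64×64 patch size, and the frequency and width values are assumptions chosen for illustration; they are not the receptive fields used in Figure 1.

```python
import numpy as np

def gabor(size, x0, y0, theta, freq=0.15, sigma=3.0):
    """Hypothetical Gabor patch: Gaussian envelope times an oriented cosine."""
    y, x = np.mgrid[0:size, 0:size].astype(float)
    xr = (x - x0) * np.cos(theta) + (y - y0) * np.sin(theta)
    envelope = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def state_space_angle(rf1, rf2):
    """Equation 2: angle in degrees between two receptive fields as vectors."""
    v1, v2 = rf1.ravel(), rf2.ravel()
    cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# Same orientation, far apart: no overlap, so the pair is close to orthogonal
far_apart = state_space_angle(gabor(64, 12, 32, 0.0), gabor(64, 52, 32, 0.0))
# Same orientation, shifted by one pixel: strong overlap, angle below 90 degrees
overlapping = state_space_angle(gabor(64, 32, 32, 0.0), gabor(64, 33, 32, 0.0))
# Identical receptive fields: angle of (numerically) zero
same = state_space_angle(gabor(64, 32, 32, 0.0), gabor(64, 32, 32, 0.0))
print(far_apart, overlapping, same)
```

Both of the first two pairs have an orientation difference of zero, yet their state-space angles differ, which illustrates why the two quantities should not be confused.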
 
Classic selectivity
Consider a neuron with the receptive field shown in Figure 2. This figure shows the receptive field of a theoretical neuron with a linear response. The receptive field shown on the left is the response to a particular basis set (pixels). The spectrum on the right shows the frequency response of the neuron. This represents the response to grating stimuli (the Fourier basis). If the neuron is linear, then the two response profiles are simply Fourier transforms of each other (assuming we have the full two-dimensional set of Fourier coefficients). 
Figure 2
 
An example of a neuron that would classically be described as narrowly tuned. Such a neuron is well localized in the frequency domain. In this article, we argue that if a neuron is linear, then it is as selective as any other linear neuron. See text for details.
This is a neuron that would be classically considered narrowly tuned because it is well localized in the Fourier domain. A wide range of studies has focused on the narrow frequency selectivity of V1 neurons (e.g., De Valois, Albrecht, & Thorell, 1982; Sachs, Nachmias, & Robson, 1971). Many early debates focused on the function of this selectivity with suggestions that varied widely, including the idea that the visual system was attempting a process analogous to the Fourier transform (e.g., see De Valois et al., 1978). Marčelja (1980) argued that the receptive fields of V1 simple cells showed similarities to the functions described by Gabor (1946), a Gaussian multiplied by a sinusoid, that minimized the joint uncertainty in space and frequency. This suggestion was supported by physiology (Field & Tolhurst, 1986; Jones & Palmer, 1987) and led to one of the standard models of visual processing, where images are represented by arrays of Gabor functions that vary in location, orientation, and spatial frequency. Gabor (1946) and Daugman (1985) argued that these functions are optimally localized, where the localization is defined as the product of the widths in space and frequency (ΔX × ΔF). Gabor's work was derived from the Heisenberg uncertainty principle (Heisenberg, 1927) and is often considered to be a fundamental limit to localization. 
We will return to this issue of joint Gabor selectivity when we show that sensory neurons can break this limit. However, it is important to recognize that if a neuron is linear, then it is simply a vector in the high-dimensional image state space. Its selectivity to the set of all possible stimuli is not any greater than that of any other linear neuron. The response to Gaussian white noise would be the same for any linear neuron with the same gain. This classical concept of selectivity depends on only the relationship between the vector and the basis we use to describe that vector. For example, a neuron that is narrowly tuned in the Fourier domain will not be narrowly tuned (well localized) in the pixel domain. If a neuron is linear, then the receptive field defines the stimulus that gives the optimal response and defines how it will respond to all other stimuli (it is simply the projection of that stimulus onto the vector). 
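The statement about Gaussian white noise follows from a short calculation. As a sketch, assume each pixel \(S_i\) of the stimulus is drawn independently from a zero-mean Gaussian with variance \(\sigma^2\) (the symbols \(S_i\), \(rf_i\), and \(\sigma\) are our notation for this illustration). For the linear response of Equation 1, the mean response is zero and the variance is
\begin{equation}{\rm Var}[R(S)] = \sum\limits_{i = 1}^n rf_i^{\,2}\,{\rm Var}[S_i] = \sigma^2\,\lVert \overrightarrow{rf} \rVert^2\end{equation}
The response distribution therefore depends only on the length of the receptive-field vector (its gain) and not on its shape, so two linear neurons with the same gain respond identically, in distribution, to white noise.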
Let us now consider the response of a linear neuron defined in a low-dimensional state space. Figure 3 provides a simple description of such a neuron in a two-dimensional state space. This approach of describing linear and nonlinear systems can be found in Golden et al. (2016), Zetzsche et al. (1999), and Zetzsche and Nuding (2005). For a linear neuron, the family of stimuli that will produce a specific response is defined by a plane that is orthogonal to the preferred stimulus (Smax). In Figure 3a we show a simple two-dimensional example where the iso-response surface is represented by a contour, and different response magnitudes are represented by the equally spaced contours. If the neuron is linear past some threshold, then the response contours will be like that shown in Figure 3b.
Figure 3
 
Four basic types of neural-response geometry in a 2-dimensional state space. On the left, the neuron is represented as a vector ([1, 0]) in the 2-dimensional state space. The two dimensions in these figures represent two stimulus directions: D1 represents the direction of the preferred stimulus of the neuron, and D2 represents an orthogonal direction (e.g., Figure 1). For a real neuron this can be considered to be a two-dimensional slice through the high-dimensional response space. Each of the colored lines (orthogonal or curved) is an iso-response contour which represents a set of stimuli in the image state space for which the given neuron responds with the same magnitude. The plots on the right show the response surface, where the z-axis represents response magnitude of the neuron. (a) Response geometry of a linear neuron. (b) Response geometry of a thresholded nonlinear neuron. (c) Response geometry of a compressive nonlinear neuron. (d) Response geometry of a warping nonlinear neuron. This figure is modified from that found in Golden et al., 2016.
If a neuron has a simple output nonlinearity (e.g., compressive or sigmoidal), then the iso-response surfaces remain planes but the distance between them is altered. Figure 3c shows an example of this nonlinearity. Whether the neuron has a simple threshold or sigmoidal output, the collection of stimuli that will produce the same response is described by planes that are orthogonal to the optimal stimulus. In other words, adding any stimulus orthogonal to the optimal stimulus will have no effect on the output for a neuron with simple output nonlinearities. Since the iso-response surfaces are planes, this family of nonlinearities is described as a planar nonlinearity. Probably the simplest definition of such a neuron is the following. First we define Smax as the stimulus that optimally stimulates the neuron: the stimulus that produces the highest response among all possible stimuli s of the same total energy c.
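In symbols, and assuming that total energy means the squared norm of the stimulus vector (an assumption made for this sketch), the definition of the optimal stimulus can be written as
\begin{equation}{S_{\rm{max}}} = \mathop {\arg \max }\limits_{s\,:\,\lVert s \rVert^2 = c} R(s)\end{equation}
For a linear neuron, the Cauchy–Schwarz inequality implies that this maximum is reached when s points in the same direction as the receptive field.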
For most of the cases we will be describing here, the optimal stimulus can be derived from the feed-forward weights. Let us consider the case where the feed-forward weights are linear. The optimal stimulus is simply a weighted sum of the inputs:  
\begin{equation}\tag{3}{S_{\rm{max}}} = \sum\limits_{i = 1}^n {{\psi _i}*R({\psi _i})} \end{equation}
where ψ is any orthonormal basis set and R(ψi) is the linear response to stimulus ψi. That is, for a linear neuron, the receptive field will be the same whether one measures the responses with gratings or pixels.  
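The basis independence stated here can be checked numerically. The sketch below applies Equation 3 to a hypothetical 16-dimensional linear neuron, once with the pixel basis and once with an arbitrary orthonormal basis obtained from a QR decomposition (standing in for gratings). The dimensions, the random weights, and the helper names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
rf = rng.standard_normal(n)            # hypothetical linear neuron (weights)

def response(stimulus):
    """Linear response of Equation 1."""
    return float(rf @ stimulus)

def reconstruct_smax(basis):
    """Equation 3: sum of basis vectors, each weighted by its response."""
    return sum(response(psi) * psi for psi in basis)

pixel_basis = np.eye(n)                # rows are single-pixel stimuli
q, _ = np.linalg.qr(rng.standard_normal((n, n)))
rotated_basis = q.T                    # rows of an arbitrary orthonormal basis

smax_pixels = reconstruct_smax(pixel_basis)
smax_rotated = reconstruct_smax(rotated_basis)
print(np.allclose(smax_pixels, smax_rotated))
```

Both reconstructions return the same vector, proportional to the neuron's weights. The later sections of the article argue that this equivalence breaks down once the iso-response surfaces are curved.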
For a linear neuron, the addition of any stimulus orthogonal to the optimal stimulus will produce no change in response (Equation 4).  
\begin{equation}\tag{4}R({S_{\rm{max}}} + {S_2}) = R({S_{\rm{max}}})\,|\,{S_{\rm{max}}}\, \bot \,{S_2}\end{equation}
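Equation 4 follows directly from Equation 1. As a brief sketch, write the receptive field of a linear neuron as \(\overrightarrow{rf} = k\,S_{\rm max}\) with \(k > 0\) (the constant k is introduced here only for illustration). Then
\begin{equation}R({S_{\rm{max}}} + {S_2}) = \langle \overrightarrow{rf} , {S_{\rm{max}}} \rangle + k\,\langle {S_{\rm{max}}} , {S_2} \rangle = R({S_{\rm{max}}}) + 0\end{equation}
because the inner product of orthogonal stimuli is zero.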
 
This will be true even when the neuron has a simple output nonlinearity (threshold, sigmoidal, etc.). This is simply stating that the iso-response surface for a neuron with a simple output nonlinearity remains a plane. For neurons with simple planar nonlinearities, there are a variety of techniques that can be used to characterize the neuron's behavior. For example, with such neurons, reverse correlation techniques are capable of recovering the receptive field and predicting the full response of the neuron to any novel stimulus (De Boer & Kuyper, 1968; Ringach & Shapley, 2004). 
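As an illustration of reverse correlation with a planar nonlinearity, the sketch below simulates a linear stage followed by a threshold and Poisson spiking, then estimates the receptive field with a spike-triggered average. The stimulus dimensions, the threshold of 0.5, the Poisson spiking assumption, and the trial count are all assumptions of this toy simulation, not parameters from the cited studies.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pixels, n_trials = 25, 50_000

rf = rng.standard_normal(n_pixels)
rf /= np.linalg.norm(rf)                      # hypothetical unit-norm receptive field

stimuli = rng.standard_normal((n_trials, n_pixels))   # Gaussian white noise
drive = stimuli @ rf                          # linear stage (Equation 1)
rates = np.maximum(drive - 0.5, 0.0)          # planar output nonlinearity (threshold)
spikes = rng.poisson(rates)                   # assumed Poisson spike counts

# Spike-triggered average: mean stimulus weighted by spike count
sta = spikes @ stimuli / spikes.sum()

# Cosine similarity between the estimate and the true receptive field
similarity = float(sta @ rf / np.linalg.norm(sta))
print(similarity)
```

The estimate points in nearly the same direction as the true receptive field because the threshold changes only the spacing of the planar iso-response surfaces, not their orientation.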
In the next section, we will consider cases where the response R(Smax + S2) is not equal to the response R(Smax). 
Hyperselectivity
Our ideas of hyperselectivity follow from our previous work on the geometry of sparse coding networks with overcomplete codes (Golden et al., 2016). That work borrows a number of elements from earlier studies on the curved geometry implied by various known nonlinearities (Field & Wu, 2004; Zetzsche et al., 1999; Zetzsche & Nuding, 2005; Zetzsche & Rohrbein, 2001). We argued that the classic versions of sparse coding (Olshausen & Field, 1996) will produce a curvature in the iso-response surfaces of a neuron if the networks are overcomplete. 
Here, we define hyperselectivity with regard to the optimal stimulus Smax. For any given dimension orthogonal to Smax, the neuron shows hyperselectivity if the response to Smax plus an orthogonal stimulus is less than the response to Smax:  
\begin{equation}\tag{5}R({S_{\rm{max}}} + {S_2}) \lt R({S_{\rm{max}}})\,|\,{S_{\rm{max}}}\, \bot \,{S_2}\end{equation}
 
As we noted in Golden et al. (2016), a number of well-known nonlinearities have this general behavior. For example, with end stopping, a stimulus outside of the classical receptive field (but within the nonclassical receptive field) produces no response. However, if we present a stimulus within the classical receptive field and then add a stimulus outside of the classical receptive field, the neuron may be inhibited. A similar behavior occurs with cross-orientation inhibition. A bar orthogonal to the neuron's preferred orientation may produce no response; however, if we stimulate the neuron with a bar of its preferred orientation and add the orthogonal stimulus, it can inhibit the neuron. We believe that Equation 5 captures this general behavior and implies an underlying response geometry. In the next section, we describe four classes of model that have been used to explain this nonlinear behavior. Although each has important differences, we argue that they succeed because they all have a similar geometry. 
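The contrast between Equation 4 and Equation 5 can be sketched with a two-dimensional toy example. The planar neuron below is a linear stage with a threshold; the second neuron divides the same drive by a constant plus the stimulus norm. This divisive form and the constant `sigma` are assumptions chosen to illustrate the geometry; they are not one of the fitted models compared in the article.

```python
import numpy as np

smax = np.array([1.0, 0.0])    # preferred stimulus direction (D1)
s2 = np.array([0.0, 1.0])      # orthogonal stimulus direction (D2)

def planar_response(stimulus):
    """Linear drive followed by a threshold: iso-response contours stay straight."""
    return max(float(smax @ stimulus), 0.0)

def normalized_response(stimulus, sigma=0.5):
    """Toy divisive normalization (assumed form): drive / (sigma + stimulus norm)."""
    drive = max(float(smax @ stimulus), 0.0)
    return drive / (sigma + np.linalg.norm(stimulus))

alone_planar = planar_response(smax)
plus_orth_planar = planar_response(smax + s2)      # unchanged, as in Equation 4

alone_norm = normalized_response(smax)
plus_orth_norm = normalized_response(smax + s2)    # reduced, as in Equation 5
orth_only_norm = normalized_response(s2)           # orthogonal stimulus alone

print(alone_planar, plus_orth_planar)
print(alone_norm, plus_orth_norm, orth_only_norm)
```

In the toy normalized neuron, the orthogonal stimulus evokes no response by itself but lowers the response when added to the preferred stimulus, which mirrors the qualitative pattern described for end stopping and cross-orientation inhibition.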
Four approaches to hyperselectivity
Figure 4 describes four approaches that generate hyperselectivity: sparse coding (e.g., Olshausen & Field, 1996), the fan equation (Golden et al., 2016), gain control with divisive normalization (e.g., Heeger, 1992; Schwartz & Simoncelli, 2001), and a recent example of a linear-nonlinear model (Pagan et al., 2016). We believe that this description of the response geometry of neurons provides a useful means of comparing different models and understanding common elements of nonlinear systems that might appear quite dissimilar. In Golden et al. (2016), we described several approaches that produce curvature. In this section, we will begin by discussing the approaches of sparse coding and gain control and then extend these ideas of curvature to linear-nonlinear models that use a quadratic term followed by a linear sum (e.g., Pagan et al., 2016). 
Figure 4
 
The types of curvature produced by four models of V1 nonlinearities. Each approach can produce hyperselectivity of variable magnitude. For each model, we plot the iso-response contours in two dimensions. We show these contours for a single neuron along with the contours for two neurons (second column: orthogonal; third column: nonorthogonal). The four approaches are sparse coding (a–c), the fan equation (d–f), gain control (g–i), and a cascaded linear-nonlinear model (j–l). For the sparse coding and fan-equation models, the curvature depends on the angle between neighboring neurons (angle in image state space). If the neighbors are orthogonal, there is likely to be no or little curvature. For gain control, the curvature depends on whether the neighboring neuron is part of the group involved in divisive normalization. This can produce curvature even in cases where the neurons are orthogonal. Each of these approaches curves the iso-response contours differently. More critically, the grid of the iso-response contours will cover the image space in different ways for each of these models.
Figure 4 shows the two-dimensional curvature created by these different approaches. It should be noted that these different approaches can have free parameters that alter the curvature we describe here. Nevertheless, we believe these visuals can be useful in showing the implications of the equations. For each of the different approaches, we show an example of the curvature generated by the model, and two examples of how the iso-response contours interact when two neighboring neurons are orthogonal (second column) and nonorthogonal (third column). 
The first column in Figure 4 shows an example where we have attempted to roughly equate the magnitude of the curvature in the different methods. Each of these approaches can alter the magnitude of the curvature by varying free parameters; however, the form of the curvature differs across the methods. The second column shows an example of the curvature produced when the neighboring neurons are orthogonal. Our goal here is to provide an example of how the iso-response contours of two neighboring neurons are mapped onto the space between them. The third column shows the curvature when the neighbors are 60° apart (i.e., not orthogonal). Again, we wish to emphasize that these models are not always clear in terms of how the nonlinearity behaves with nonorthogonal vectors, but again, we think the comparison can be helpful. 
Our first approach is overcomplete sparse coding (e.g., Olshausen & Field, 1996). The curvature associated with sparse coding was described by Golden et al. (2016) and is summarized in the first row of Figure 4. Typically, with sparse coding solutions, one focuses on the receptive fields that are learned (the bases). However, here we are focusing on the response geometry within the subspace defined by a family of neurons. Figure 4 shows the response magnitudes of neurons defined by the vectors (the bases). 
In sparse coding, the bases and the responses are learned by minimizing the following energy function:  
\begin{equation}\tag{6}E = [{\rm{preserve\ information}}] + \lambda \times [{\rm{sparseness\ of}}\ {a_i}]\end{equation}
 
\begin{equation}\tag{7}E = {1 \over 2}|I - \phi A{|^2} + \lambda \sum\limits_i {S({a_i})} \end{equation}
where I represents the input image patches, ϕ is the basis matrix containing the feed-forward weights of all the neurons in the network, λ is the tradeoff parameter between the sparsity of the network and the reconstruction error, \(a_i\) is the response activity of the ith neuron, and \(S(a_i)\) is the cost function. Sparse coding attempts both to minimize the reconstruction error (preserve information) and to minimize the cost of the response activities \(a_i\). With sparse coding, we have found that the curvature is a function of the angle between neighboring neurons. When the angles between neighbors are 90° (orthogonal), there is little or no curvature. As the angle between neighboring neurons decreases, the curvature increases accordingly. We have found (Golden et al., 2016) that the equation of a folding fan is a good approximation to the curvature produced by sparse coding when applied in two-dimensional space; that is, the curvature resembles what happens to orthogonal lines drawn on an open fan when the fan is closed. The equation of a folding fan is  
\begin{equation}\tag{8}\eqalign{ {a_i} = {f_{Fan}}(c,\theta ) = c \times \cos (n({f_i},{f_j})\theta ) \cr \,\,n({f_i},{f_j}) = {{\pi /2} \over {\arccos \left({{\left\langle {{f_i},{f_j}} \right\rangle } \over {||{f_i}||\,||{f_j}||}}\right)}} \cr} \end{equation}
 
where c is the distance of a stimulus from the origin (i.e., the stimulus contrast), θ is the angle between a stimulus and the neuron, \(a_i\) is the response magnitude of neuron i, n determines the curvature, and \(f_i\) and \(f_j\) are the vectors representing the two neurons. n is a function of the angle between neighboring neurons. When n = 1, the iso-response contours are flat (i.e., linear); when n > 1, the neuron has iso-response contours with exo-origin curvature; and when n < 1, the neuron has iso-response contours with endo-origin curvature. 
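Equation 8 can be evaluated directly. The following is a minimal sketch, not the code used for the figures; the function names `fan_exponent` and `fan_response` are illustrative assumptions, and the sketch assumes the two neuron vectors point in distinct directions (so that the angle between them is nonzero).

```python
import numpy as np

def fan_exponent(f_i, f_j):
    """n(f_i, f_j) from Equation 8: (pi/2) divided by the angle between two neurons."""
    f_i = np.asarray(f_i, dtype=float)
    f_j = np.asarray(f_j, dtype=float)
    cos_angle = np.dot(f_i, f_j) / (np.linalg.norm(f_i) * np.linalg.norm(f_j))
    # Clip guards against rounding slightly outside [-1, 1] before arccos.
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return (np.pi / 2) / angle

def fan_response(c, theta, n):
    """a_i = c * cos(n * theta) from Equation 8 (theta in radians)."""
    return c * np.cos(n * theta)

# Orthogonal neighbors give n = 1 (flat contours); neighbors 60 degrees apart give n = 1.5.
n_orthogonal = fan_exponent([1.0, 0.0], [0.0, 1.0])
n_sixty = fan_exponent([1.0, 0.0], [np.cos(np.pi / 3), np.sin(np.pi / 3)])
```

With neighbors 60° apart, a stimulus aligned with the neighbor (θ = 60°) gives n·θ = 90°, so the response of the primary neuron falls to zero at the neighbor's preferred direction.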
The second row in Figure 4 shows the behavior of the fan equation in two dimensions. As can be seen, for both sparse coding and the fan equation, the curvature is quite similar. There are a few points worth noting. 
  •  
    For both sparse coding and the fan equation, the curvature is mostly near the vector representing the optimal stimulus Smax of a neuron. Away from the vector, the iso-response surfaces flatten out and become parallel to the vectors representing the neighboring neurons.
  •  
    The curvature is determined by the angle between neighboring neurons (Golden et al., 2016). If two neighboring neurons are orthogonal, then there is no curvature in the subspace between them. A smaller angle produces higher curvature, as shown in Figures 4c and 4f. Using a high-dimensional overcomplete sparse coding network (e.g., 256 pixels), we found in our previous work that when the network is applied to natural scenes, the angle between neurons does predict the curvature, although the resulting curvature was less than predicted by the fan equation (Equation 8). We are currently investigating whether the reduced curvature is due to a limitation in the sparse coding network or is an efficient component of the network.
  •  
    For any given stimulus, the ratio of the responses \(({a_1}/{a_2})\) of the different neurons is independent of the contrast of the stimulus. That is, despite the nonlinearities, the relative activities of the neurons responding to a stimulus do not change as the contrast changes (from \(c_1\) to \(c_2\)):
 
\begin{equation}\tag{9}{\left. {{{a_1}} \over {{a_2}}} \right|_{{c_1}}} = {\left. {{{a_1}} \over {{a_2}}} \right|_{{c_2}}}\end{equation}
 
  •  
    The degree of hyperselectivity is roughly constant at different response magnitudes (the iso-response curves are roughly shifted versions of one another).
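For the fan equation (Equation 8), the contrast invariance of the response ratio can be checked in one line. As a brief sketch, assume two neurons with curvature parameters \(n_1\) and \(n_2\) and a stimulus of contrast c at angles \(\theta_1\) and \(\theta_2\) from them (the subscripts are introduced here for illustration only):

\begin{equation}{{{a_1}} \over {{a_2}}} = {{c\cos ({n_1}{\theta _1})} \over {c\cos ({n_2}{\theta _2})}} = {{\cos ({n_1}{\theta _1})} \over {\cos ({n_2}{\theta _2})}}\end{equation}

The contrast c cancels, so the ratio depends only on the direction of the stimulus in state space, which is the property stated in Equation 9.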
The first point has the advantage that it makes a clear prediction: If the fan equation were a complete account of curvature in V1, then the angle between neighboring neurons would predict the magnitude of hyperselectivity in visual neurons. Two orthogonal neurons would not curve the response space between them. The disadvantage of this approach is that it does not allow the curvature to be fine-tuned to the statistical redundancy between visual neurons. The sparse coding network does optimize the directions of the vectors (i.e., the receptive fields) based on the statistical properties of the signal (e.g., natural scenes). However, once vectors have been selected, the curvature depends on only the angles between neurons. The approach also does not provide an account of invariance or tolerance (endo-origin curvature) as is found with complex cells. Finally, the sparse coding approach does not produce saturation. The approach can be modified to include a saturating nonlinearity (see Golden et al., 2016). However, it is not a natural component of the model. 
One can ask whether it is possible to generalize the fan equation to high dimensions. We are currently working on this problem. Although we believe it is possible to write the required equations for a high-dimensional fan in a particular subspace, the tiling of the space is a much more difficult problem. What we find interesting is that the sparse coding algorithm does converge on a tiling solution. We are currently trying to determine whether the solution it finds is a relatively efficient one. 
Gain control
Gain-control-like behavior has been found to be present in a wide variety of sensory neurons (Geisler & Albrecht, 1992; Rabinowitz, Willmore, Schnupp, & King, 2011; Schwartz & Simoncelli, 2001). Although this can be modeled in a variety of ways, here we will focus on divisive gain control, described by Heeger (1992):  
\begin{equation}\tag{10}resp = {{{r_1}} \over {{{{r_1} + {r_2}} \over 2} + 1}}\end{equation}
where r1 and r2 are the squared linear responses of two neurons.  
Figure 4 shows the divisive-gain-control model for a simple two-neuron system with two orthogonal neurons. In this model, the output of any neuron is reduced by the activity of a set of surrounding neurons. In Equation 10, the divisor includes the activity of the primary neuron. This ensures that the response saturates. The difficulty with the gain-control model is that it does not explicitly describe which neurons are involved in the division. In general, it would include neurons that overlap in their receptive fields, but it can include neurons that are orthogonal to the primary neuron. 
One should note that with this equation (unlike sparse coding and the fan equation), the curvature increases with increasing response magnitude. That is, the neuron becomes more hyperselective with increasing response magnitude, responding to less of the full response space. In the second column, we show an example of the curvature that is produced when the two neurons are orthogonal. Unlike the fan equation, the division will create curvature even in orthogonal conditions. Division in nonorthogonal conditions (shown in the right column) can increase the magnitude of the curvature (make the neuron even more hyperselective). 
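The behavior of Equation 10 for two orthogonal neurons can be sketched as follows. This is an illustrative example rather than the code behind Figure 4; it assumes the two neurons are unit vectors along the x and y axes of a 2-D stimulus space, and the function names are hypothetical. The contour formula is obtained by solving Equation 10 for x at a fixed response level k.

```python
import numpy as np

def gain_control_response(r1, r2):
    """Equation 10: r1 and r2 are the squared linear responses of two neurons."""
    return r1 / ((r1 + r2) / 2.0 + 1.0)

def response_to_stimulus(x, y):
    """Response of the neuron along x to stimulus (x, y); neurons assumed orthonormal."""
    return gain_control_response(x ** 2, y ** 2)

def iso_response_x(y, k):
    """x-coordinate of the iso-response contour resp = k, valid for 0 < k < 2."""
    return np.sqrt(k * (1.0 + y ** 2 / 2.0) / (1.0 - k / 2.0))

# The contour bends away from the origin as y grows, even though the neurons are orthogonal.
bend_low = iso_response_x(0.5, 0.5) - iso_response_x(0.0, 0.5)
bend_high = iso_response_x(0.5, 1.5) - iso_response_x(0.0, 1.5)
```

In this sketch the response approaches but never reaches 2 as the input grows (saturation), and the bend of the contour is larger at the higher response level, in line with the description above.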
There are a number of variations of the gain-control model. One of the more interesting models is that of Schwartz and Simoncelli (2001), where the choice of neurons involved in the inhibition (and the magnitude of the inhibition) was learned in order to minimize the redundancy with the neighboring neurons. It has been noted that the 2-D probability density function for neighboring neurons is kurtotic and often circular (Zetzsche et al., 1999)—it is not the star shape one would expect if the pair of neurons were independent. The curvature in the iso-response lines that we see in the middle row of Figure 4b will distort a circular probability density function and push the pair of neurons to be more independent (Schwartz & Simoncelli, 2001). The fan equation shown in Figure 4e will not produce this distortion (a circular probability density function will remain circular). We believe this is an important topic, and we are currently investigating it, but it is beyond the scope of this article. However, our opinion is that the nonlinear distortion created by this gain-control curvature is not the appropriate approach for removing this form of dependency. It should also be noted that like sparse coding, divisive inhibition will not produce tolerance or invariance. 
In this article, we are not arguing that physiology provides clear support for one of these four models. Our main goal here is to show the similarities and differences among these different approaches. Each of the models described here curves the iso-response surfaces and can create hyperselectivity. However, each model forms this curvature in a different way. 
Quadratic curvature and linear-nonlinear models
A third type of curvature is the family of quadratic curvatures generated by the equation  
\begin{equation}\tag{11}resp = ar_1^2 + br_2^2 + c{r_1}{r_2} + d{r_1} + e{r_2} + f\end{equation}
where r1 and r2 are linear responses of two neurons and a, b, c, d, e, and f are free parameters of the quadratic model. This will generate a wide family of curves that are described by hyperbolas and ellipses in lower dimensional subspaces. A more restricted family of quadratic curves is represented by  
\begin{equation}\tag{12}resp = ar_1^2 + br_2^2{\rm .}\end{equation}
This family of curves will also create ellipses and hyperbolas in the lower dimensional subspaces; however, with this equation the curvature will be symmetric around the bases (r1, r2, etc.). This is an interesting family of curves due to their algebraic simplicity. Figure 4 shows an example of the hyperselective curve that is produced by \(resp = r_1^2 - r_2^2\) for the two-dimensional example (we are showing only the regions where the neuron produces positive activity).  
Pagan et al. (2016) have recently created a cascaded linear-nonlinear model (Schwartz et al., 2006) that produces a family of quadratic curves like that shown in Equation 12. Their model uses back propagation to learn a set of weights under the constraints imposed on the linear-nonlinear-linear model. The first stage is linear and can be considered to be a linear V1-like layer with Gabor-like receptive fields (although these are learned and will depend somewhat on the images that the network is taught to classify). The second layer is a simple output nonlinearity: a squaring operation. At this stage, there is no curvature. This is a simple planar nonlinearity, with all iso-response surfaces remaining flat. The next stage is a simple sum (or difference) of these squared outputs. Zetzsche and Barth (1990) and Zetzsche and Rohrbein (2001) have noted that this relatively simple combination of linear and nonlinear layers may provide a means of creating this hyperselective curvature. 
The curvature for any given neuron can be in the form of ellipses when the squared terms are summed (endo-origin curvature) or in curvatures away from the origin (exo-origin curvature) like those shown in Figure 4j through 4l. With both sums and differences, and with different gains for each term (see Equation 12), it is possible for any neuron to have a combination of curvatures allowing it to be hyperselective in some dimensions and invariant or tolerant in others. However, the quadratic terms do constrain the form of the curvature and will not produce the curvatures produced by either the fan equation or divisive gain control. In Figure 4j we focus on the curvature that allows hyperselectivity. The simple difference of squared terms creates a hyperbolic surface like that shown. 
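The linear, squaring, and summation stages described above can be sketched in a few lines. This is a minimal illustration of Equation 12, not the model of Pagan et al. (2016); the filters here are fixed by hand rather than learned, and the function name `ln_cascade_response` is a hypothetical choice.

```python
import numpy as np

def ln_cascade_response(stimulus, filters, gains):
    """Linear stage, squaring nonlinearity, then a weighted sum (Equation 12)."""
    linear = np.asarray(filters, dtype=float) @ np.asarray(stimulus, dtype=float)
    squared = linear ** 2                      # planar nonlinearity, no curvature yet
    return float(np.dot(gains, squared))       # sum or difference creates the curvature

filters = np.eye(2)                            # two orthonormal first-stage neurons (assumed)
stimulus = np.array([2.0, 1.0])

hyperselective = ln_cascade_response(stimulus, filters, [1.0, -1.0])  # r1^2 - r2^2: hyperbolic contours
tolerant = ln_cascade_response(stimulus, filters, [1.0, 1.0])         # r1^2 + r2^2: elliptical contours
doubled = ln_cascade_response(2 * stimulus, filters, [1.0, -1.0])     # contrast doubled
```

Doubling the stimulus contrast quadruples the output, which illustrates the squared dependence on contrast that this class of model implies.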
Although this approach can create curvatures that produce hyperselectivity similar to the previous approaches, there are important differences. The quadratic approach has the advantage that it can generate both hyperselective (exo-origin) and invariant or tolerant (endo-origin) response curvatures. As can be seen, the curvature does not flatten out like we see with sparse coding and the fan equation (i.e., the iso-response contour does not become parallel to the adjacent vectors). There is also no strict relationship between the angle of neighboring neurons and the degree of curvature. In Figures 4j through 4l, we show examples of how neighboring neurons might interact. However, the advantage of the approach described by Pagan et al. (2016) is that the curvature can be adapted to meet the needs of the deep network (i.e., effective classification). One other aspect of this approach is that the response of a first layer neuron to its stimulus must increase with the square of the input contrast. This does not appear to be biologically plausible. It is certainly possible to modify the output of the second-layer neurons to produce a saturating response, but it is not an inherent part of the model. One major advantage of this approach is that it allows curvature to be produced in an entirely feed-forward model. Models that produce curvature by inhibition (e.g., division) from neighboring neurons have the difficulty that the reciprocal inhibition can easily create a temporal instability. The feed-forward models circumvent this problem. However, it remains to be seen which is biologically correct. We are currently looking at ways that the fan equation might be implemented using a feed-forward network. 
In this section, we are not making a strong claim for any one kind of curvature. Although we are fond of the fan equation, our goal is to show that these approaches have a number of similarities as well as important differences. We think a future model of biological neurons may include components of each of these. Each of these can create hyperselectivity. The fan equation allows the root-mean-square response magnitude to be constant for stimuli of constant contrast and appears close to what we see in sparse coding networks. However, the gain-control model has been used to model a wide variety of physiological results (e.g., Mély & Serre, 2017; Tolhurst & Heeger, 1997), although not invariance. The cascaded linear-nonlinear model (e.g., Pagan et al., 2016; Schwartz et al., 2006; Vintch, Zaharia, Movshon, & Simoncelli, 2012) has the advantage that it can produce both hyperselectivity and invariance (i.e., tolerance). 
The obvious question is whether the physiological data support any particular one of these models. We feel that the current data are inconclusive. The most promising approach is found in articles that use spike-triggered covariance to probe the subspace between interacting neurons (e.g., Rust et al., 2004, 2005; Schwartz et al., 2002; Schwartz et al., 2006; Vintch et al., 2015). This approach allows one to extract both inhibitory- (hyperselective) and excitatory-interaction (invariant) dimensions. In this article, we are focusing on the inhibitory subspaces. Results when the technique is applied to V1 neurons (e.g., Chen, Han, Poo, & Dan, 2007; Rust et al., 2005; Vintch et al., 2015) are inherently noisy, but the work demonstrates that most neurons have inhibitory directions. As Vintch et al. (2015) have noted, the orthogonal directions revealed by spike-triggered covariance do not necessarily represent the neurons that are involved in the inhibition. Indeed, with the fan equation and sparse coding, the hyperselectivity occurs only when the neighboring neurons are nonorthogonal. Nevertheless, some of these results could be used to compute the iso-response contours for the inhibitory subspace (e.g., Rust et al., 2005, figure 5c; or Schwartz et al., 2006, figure 12) for a model neuron. 
In the next section, we consider three implications of this curvature. First, we make the distinction between the stimulus that optimally stimulates a neuron (Smax; what a neuron is selective to) and how selective the neuron is to that stimulus. We argue that without this distinction we end up with an apparent paradox: that a neuron can be very narrowly tuned to a broadband stimulus. Second, we show that with this curvature, the optimal stimulus will not be the same as the receptive field. That is, the receptive field depends on the basis set used to measure it. Finally, we show that if we compare the localization of the neuron in the spatial-frequency domain and the space domain, neurons with hyperselectivity can violate the Gabor localization limit. 
Comparing selectivity and hyperselectivity
It is common to describe the selectivity of a neuron by showing what it is selective to. However, this ignores the question of how selective the neuron is to this stimulus. Consider Figure 5a, which shows a portion of the receptive fields learned by a 1.3-times overcomplete sparse coding network (e.g., Olshausen & Field, 1996), and Figure 5d, which shows some of the receptive fields learned by a 13-times overcomplete network. As Olshausen (2013) has noted, the receptive fields in these highly overcomplete codes show selectivity to a number of features not found in the less overcomplete sparse codes. While the standard sparse coding produces Gabor-like receptive fields, the highly overcomplete codes show a variety of other forms (e.g., spots, curves, and plaids). This is certainly interesting, but the nonlinearities (e.g., the magnitude of hyperselectivity) that these codes show are missing from these simple descriptions showing the 2-D receptive fields. 
Figure 5
 
(a, d) Receptive fields learned from a 1.3-times and a 13-times overcomplete sparse coding network, respectively. Classically, these receptive fields provide the primary way of describing the outputs of these networks. The 13-times overcomplete network, for example, produces more complex receptive fields that include plaids, spots, and curves. However, we argue that the nonlinearities in these codes change in consistent ways as the network becomes overcomplete. In particular, the overcomplete networks become more hyperselective. In a critically sampled code, the majority of neurons will be nearly orthogonal with their neighbors. In such a case, there is little curvature. (b) An example of the iso-response contours when the neighbors are orthogonal. In an overcomplete network there are more neurons than dimensions (e.g., pixels). This forces the angles between many neurons to be less than 90°. (c) The curvature in 2-D space when there are four neurons representing that space. (e) The curvature changes as the sparse coding network becomes more overcomplete. For this figure, we trained a sparse coding network on 8 × 8 natural-scene image patches. We varied the overcompleteness of the network from 1.3 times (e.g., Olshausen & Field, 1996) to 13 times. We then measured the curvature for the 2-D subspace defined between any pair of neurons in the network. For all of these networks, the majority of pairs will be orthogonal. We therefore measured the curvature for only the five neurons with the most overlap for each neuron in the network (i.e., the five neurons with the smallest angle in the image space). See text for details. (e) The average curvature as a function of overcompleteness. The figure also shows the average smallest angle of these five closest bases for each basis as a function of overcompleteness. As the network becomes more overcomplete, the curvature between neighbors increases (i.e., the network becomes more hyperselective).
Figure 5b shows a 2-D representation of the curvature when the bases are orthogonal. When the network is 1.3-times overcomplete, the majority of the bases are orthogonal; therefore, there is little curvature in such a network. However, as the network becomes more overcomplete, the angles between neighbors decrease and the sparse coding network curves the iso-response surfaces to reduce the redundancy produced by overcompleteness (Golden et al., 2016). Figure 5c shows a 2-D example where the network is 2.6-times overcomplete, and this results in significant curvature. It should be noted that in high dimensions, doubling the number of bases reduces the angle between the bases by a relatively small amount. This implies that the curvature will increase by only a relatively small amount. Figure 5e describes the angle and curvature produced in networks as a function of the degree of overcompleteness. For this figure we used 8 × 8 natural-scene image patches to train 64-D sparse coding networks with overcompleteness ranging from 1.3 to 13 times. In each network we computed the angle between each pair of the learned bases, and the curvature of the iso-response contour in the 2-D subspace defined by that pair of bases. To measure the curvature we fitted the iso-response contour with a second-order polynomial \((ax^2 + bx + c)\), and the magnitude of the curvature was the magnitude of the coefficient a. Figure 5e shows the average curvature of the five closest bases to each basis (i.e., the mean curvature of the most curved regions of each neuron's response surface). We also plot the average angle of these five closest bases (angle plotted on the right axis). These results show that with an increase in the degree of overcompleteness, the angle between neighbors decreases and the curvature (hyperselectivity) increases. 
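The curvature measure described above (the magnitude of the quadratic coefficient of a polynomial fit) can be illustrated on a synthetic contour. The sketch below is an assumption-laden stand-in: it uses an iso-response contour generated by the fan equation (Equation 8) rather than one measured from a trained network, and the coordinate convention (neuron along the y axis), the ±0.3 rad sampling range, and the function names are all choices made here for illustration.

```python
import numpy as np

def fan_iso_contour(n, k=1.0, max_theta=0.3, num=201):
    """Points (x, y) on the contour c * cos(n * theta) = k, with the neuron along the y axis."""
    theta = np.linspace(-max_theta, max_theta, num)
    c = k / np.cos(n * theta)          # contrast needed to reach response k at each angle
    x = c * np.sin(theta)              # coordinate perpendicular to the neuron
    y = c * np.cos(theta)              # coordinate along the neuron
    return x, y

def contour_curvature(x, y):
    """Magnitude of coefficient a in a second-order fit y = a*x^2 + b*x + c."""
    a, b, c = np.polyfit(x, y, 2)
    return abs(a)

curv_flat = contour_curvature(*fan_iso_contour(1.0))   # orthogonal neighbors
curv_mild = contour_curvature(*fan_iso_contour(1.2))   # neighbors 75 degrees apart
curv_high = contour_curvature(*fan_iso_contour(1.5))   # neighbors 60 degrees apart
```

Under these assumptions the fitted coefficient is essentially zero for n = 1 and grows as the angle between neighbors shrinks, which mirrors the trend reported for the overcomplete networks.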
As the networks become more overcomplete, the neurons in the network become more hyperselective. It is interesting that the overcomplete networks become selective to new types of stimuli; however, we believe that it is the hyperselectivity that is more interesting. This is not represented by the receptive fields. 
Can a neuron be narrowly tuned to a broadband stimulus?
Without the distinction between classic selectivity and hyperselectivity, we can end up with what may seem to be a paradox. Consider a neuron that has an optimal stimulus, like that shown in Figure 6a. This is simply a broadband Gabor function. Classically, such a neuron would be considered to be broadly tuned because of its bandwidth in spatial frequency and orientation. If we measure the response to gratings of different orientations we get curves like those shown in red in Figures 6b and 6c. If the neuron is linear, then the optimal stimulus for this neuron would be a Gabor function matched to this receptive field (i.e., the maximum response for a given contrast—Smax—would be a stimulus that matched the receptive field). 
Figure 6
 
Hyperselectivity can create a neuron that is narrowly tuned to a broadband stimulus. (a) The receptive field of a neuron that would be classically defined as broadband. For example, such a neuron would respond to a range of orientations. The orientation tuning is shown by the red curve in (b) and (c). The green curves in (b) and (c) show the effects of the nonlinear inhibition by two neighbors with similar orientations. The response of the neuron was modeled by using the fan equation (although any of the models described in Figure 4 would produce this effect). (e–f) The curvature that is produced with these neighbors. With this curvature, the optimal stimulus is unchanged; however, it responds less to nearby orientations. If the neuron is mapped with stimuli of different orientations, then it will appear to be narrowband. However, its preferred stimulus has not changed. The curvature produced by these neighboring neurons' interaction allows the neuron to be highly selective to this broadband stimulus—that is, its optimal stimulus Smax is still the broadband Gabor function shown in (a).
Now consider the case where that neuron is flanked by similar neurons that are selective to slightly different orientations (either neighbors at 60° and 120° or neighbors at 75° and 105°). If the central neuron is inhibited by the surrounding neurons (with the exo-origin curvature described before), the optimal stimulus would be unchanged (the neuron's preferred stimulus is a broadband Gabor function). However, changes to that stimulus (e.g., small changes to the orientation) would shut the neuron off (a neighboring neuron would then dominate the response and inhibit the primary neuron). Such a neuron would be very narrowly tuned to this broadband stimulus. 
Mapped with gratings of different orientations, the neuron may look as though it is very selective to orientation. However, that tuning to orientation would be a poor predictor of the optimal stimulus for the neuron (a function that is broadly tuned for orientation). The green curves in Figures 6b and 6c show the selectivity as a function of orientation. If we look at this selectivity, we might presume that the optimal stimulus would have a narrow orientation bandwidth. However, the optimal stimulus is not altered. 
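The distinction between the location of the peak and the width of the tuning can be sketched with the fan equation. This is an illustration only and does not reproduce Figure 6: the angle here is the angle in state space (which is not the same as the orientation difference between gratings), the response is set to zero beyond the first zero crossing (an assumption), and the value n = 3 and the function names are hypothetical choices.

```python
import numpy as np

def fan_tuning(theta, n):
    """Unit-contrast fan-equation response, held at zero beyond the first zero crossing."""
    phase = np.clip(n * np.asarray(theta, dtype=float), -np.pi / 2, np.pi / 2)
    return np.maximum(np.cos(phase), 0.0)

def half_height_width_deg(theta, response):
    """Total width (in degrees) of the region where the response exceeds half its peak."""
    step = theta[1] - theta[0]
    return float(np.degrees(step * np.count_nonzero(response > 0.5 * response.max())))

theta = np.linspace(-np.pi / 2, np.pi / 2, 1801)   # 0.1 degree steps in state-space angle
linear = fan_tuning(theta, 1.0)                    # n = 1: no curvature
hyper = fan_tuning(theta, 3.0)                     # n = 3: neighbors 30 degrees away

peak_linear = theta[np.argmax(linear)]
peak_hyper = theta[np.argmax(hyper)]
width_linear = half_height_width_deg(theta, linear)
width_hyper = half_height_width_deg(theta, hyper)
```

Both curves peak at the same stimulus, but the curved (n = 3) response is roughly one third as wide at half height, which is the sense in which a neuron can be narrowly tuned without any change to its optimal stimulus.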
With this kind of nonlinearity, the basis set one chooses to measure the neuron can influence the apparent preferred stimulus for the neuron. That is, the apparent receptive field changes with the basis set. This general problem has been noted before with V1 neurons. If a neuron's response is measured with pixels or lines and then is measured with gratings, the tuning of the neuron is not equivalent (e.g., Tadmor & Tolhurst, 1989; Tolhurst & Heeger, 1997). Typically the receptive field is smaller in space (when measured with bars or spots) than is expected from the response to gratings (where the receptive field is the Fourier transform of the spatial-frequency response). 
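As a baseline for this comparison, it may help to see that for a purely linear neuron the two mapping procedures agree. The sketch below uses a 1-D, 16-pixel neuron with Gabor-like weights (an arbitrary choice made here) and hypothetical function names; it maps the neuron once with single-pixel spots and once with cosine and sine gratings followed by an inverse Fourier transform.

```python
import numpy as np

def linear_response(weights, stimulus):
    """Response of a linear neuron: the dot product of its weights with the stimulus."""
    return float(np.dot(weights, stimulus))

def rf_from_spots(weights):
    """Receptive field mapped one pixel at a time."""
    n = len(weights)
    return np.array([linear_response(weights, np.eye(n)[k]) for k in range(n)])

def rf_from_gratings(weights):
    """Receptive field from the inverse Fourier transform of the grating responses."""
    n = len(weights)
    x = np.arange(n)
    spectrum = np.empty(n, dtype=complex)
    for k in range(n):
        phase = 2 * np.pi * k * x / n
        # Cosine and sine grating responses combine into one complex frequency response.
        spectrum[k] = (linear_response(weights, np.cos(phase))
                       - 1j * linear_response(weights, np.sin(phase)))
    return np.fft.ifft(spectrum).real

x = np.arange(16)
weights = np.exp(-((x - 8) / 3.0) ** 2) * np.cos(2 * np.pi * (x - 8) / 6.0)
spots_rf = rf_from_spots(weights)
gratings_rf = rf_from_gratings(weights)
```

For this linear case both estimates recover the weights exactly (up to rounding), so any systematic difference between spot-mapped and grating-mapped receptive fields points to a nonlinearity.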
Generally it is accepted that some form of nonlinearity is needed to explain this effect. Threshold nonlinearities (see, e.g., Tadmor & Tolhurst, 1989; Tolhurst & Heeger, 1997), contrast gain control (e.g., Tolhurst & Heeger, 1997), frequency suppression (De Valois et al., 1985), and some form of surround suppression (e.g., Nestares & Heeger, 1997) have all been proposed. Olshausen and Field (1997) have demonstrated that the sparse coding network will also produce this effect. They provided results (figure 10) that described both the feed-forward weights for a neuron as well as the receptive fields that would result if those neurons were mapped with single pixels or gratings. The receptive fields differed significantly. 
We show more extensive examples of this behavior and quantify the results in Figure 7. This shows the result with a 2.6-times overcomplete network trained on natural scenes using an absolute-value cost function (see Equation 7). 
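For reference, Equation 7 with an absolute-value cost, \(S(a_i) = |a_i|\), can be written out as a short function. This is a minimal sketch of the energy being minimized, not the training procedure; the variable names and the tiny two-pixel example are assumptions made here for illustration.

```python
import numpy as np

def sparse_coding_energy(image, basis, activity, lam):
    """Equation 7 with S(a) = |a|: reconstruction error plus lambda times the L1 cost."""
    residual = image - basis @ activity
    reconstruction_error = 0.5 * np.sum(residual ** 2)
    sparseness_cost = lam * np.sum(np.abs(activity))
    return float(reconstruction_error + sparseness_cost)

basis = np.eye(2)                       # columns are the basis functions (assumed layout)
image = np.array([1.0, 0.0])

energy_exact = sparse_coding_energy(image, basis, np.array([1.0, 0.0]), lam=0.1)
energy_silent = sparse_coding_energy(image, basis, np.array([0.0, 0.0]), lam=0.1)
```

In this toy case a perfect reconstruction pays only the sparseness cost (0.1), whereas a silent network pays the full reconstruction error (0.5); the parameter λ sets the tradeoff between the two terms.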
Figure 7
 
(a) The basis functions (feed-forward weights) learned using a 16 × 16 sparse coding network on natural-scene images. (b) The receptive fields (response profiles) mapped using spots. (c) The receptive fields reconstructed from the inverse Fourier transform of the frequency response (the response to gratings). The receptive fields from spots are consistently smaller than the basis functions, and the receptive fields from gratings are consistently larger than the basis functions (i.e., the bandwidths are narrower).
Notice in Figure 7 that the basis functions (Figure 7a, representing the feed-forward weights) are significantly larger than the receptive fields measured with spots (Figure 7b). The basis functions are also significantly smaller than the receptive fields measured with gratings (Figure 7c). This is the general result found by Tolhurst and others in V1 neurons. It is the result of the curvature produced with the overcomplete sparse codes. 
In Figure 8 we quantify these results. Figure 8b describes the average differences between the receptive fields as measured by the angle in state space (an angle of zero means the receptive fields are identical, while an angle of 90° means the receptive fields are orthogonal). The left bar of Figure 8b represents the angular difference between the feed-forward basis and the receptive field as measured with spots (38°), the middle bar represents the difference as measured with gratings (50°), and the right bar represents the angular difference between the receptive field as measured with spots and with gratings (55°). Figure 8a provides a graphical representation of these angular differences in three dimensions. 
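The angular measure used here can be sketched in a few lines. The following is a minimal illustration, assuming each receptive field has been flattened into a vector of pixel values; the function name `angle_deg` and the toy four-pixel vectors are hypothetical and are not taken from the simulations.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two vectors (flattened receptive fields)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

# Toy 4-pixel "receptive fields" (hypothetical values, not from the paper).
basis = [1.0, 2.0, -1.0, 0.5]
same_shape = [2.0, 4.0, -2.0, 1.0]   # scaled copy of the basis
orthogonal = [2.0, -1.0, 0.0, 0.0]   # zero dot product with the basis

print(angle_deg(basis, same_shape))  # identical up to contrast: angle of zero
print(angle_deg(basis, orthogonal))  # orthogonal: angle of 90 degrees
```

Because the angle depends only on direction, two receptive fields that differ only in overall contrast are treated as identical by this measure.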
Figure 8
 
(a) The misestimate in finding the receptive field of a neuron in a sparse coding network when mapped with spots (pixels) and gratings. The figure shows visually how far the receptive-field estimates are from the true optimal stimulus Smax. (b) The average angle between the three vectors (basis, receptive field from spots, and receptive field from gratings shown in Figure 7). (c) The relative response of each neuron to stimuli that match either the basis, the receptive fields from spots, or the receptive fields from gratings. We also show the average response when moving 50° from the basis in 10,000 random directions. The results have been normalized such that the response in all conditions would be 1.0 if the neuron were linear. These results demonstrate that the optimal stimulus for these nonlinear neurons is determined by the feed-forward weights (i.e., the basis). This optimal stimulus is not represented by either the receptive fields from spots or the receptive fields from gratings. The hyperselectivity created by sparse coding significantly reduces the response away from this optimal stimulus.
We make the argument that with this exo-origin curvature, the optimal stimulus remains the stimulus defined by the feed-forward weights (the basis). Figure 8c provides an analysis of the relative response magnitudes of these nonlinear neurons to three types of stimuli. The left bar represents the response of the neuron to a stimulus that matches the feed-forward weights (the basis), where the response has been normalized to a value of 1.0. The second bar represents the average response to a stimulus that corresponds to the receptive field measured with spots. The third bar represents the average response to a stimulus that corresponds to the receptive field measured with gratings. We have also measured the response of these nonlinear neurons to 10,000 stimuli near the basis (10,000 directions in image state space). We find that in all directions, when we are 50° away from the basis the response is reduced. These results imply that the largest response is to the feed-forward basis, with a lower response in every direction away from the basis. That is, in the sparse coding network the optimal response is determined by the feed-forward weights. To the extent that this network models the nonlinear responses of real neurons, it implies that the receptive fields that are generated by measuring the response to an orthonormal basis set (e.g., gratings, spots) will produce a significant error in estimating the optimal stimulus for a neuron. 
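One way to construct stimuli that lie a fixed angle from the basis in random directions is sketched below. This is an illustration under our own assumptions, not the procedure used in the simulations: the helper names (`unit`, `stimulus_at_angle`), the random stand-in for a 16 × 16 basis function, and the choice of 100 samples are all hypothetical.

```python
import math
import random

def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def stimulus_at_angle(basis, angle_deg, rng):
    """Unit-norm stimulus lying angle_deg away from `basis` in a random direction."""
    b = unit(basis)
    # Draw a random direction, then remove its component along the basis.
    g = [rng.gauss(0.0, 1.0) for _ in b]
    proj = sum(gi * bi for gi, bi in zip(g, b))
    u = unit([gi - proj * bi for gi, bi in zip(g, b)])
    theta = math.radians(angle_deg)
    # Rotate from the basis toward the orthogonal direction by theta.
    return [math.cos(theta) * bi + math.sin(theta) * ui for bi, ui in zip(b, u)]

rng = random.Random(0)
basis = [rng.gauss(0.0, 1.0) for _ in range(256)]  # stand-in for a 16 x 16 basis function
stimuli = [stimulus_at_angle(basis, 50.0, rng) for _ in range(100)]
```

Every stimulus produced this way has the same projection onto the basis, cos(50°) ≈ 0.64, so a purely linear neuron would respond identically to all of them. Any variation or additional reduction in the response across directions therefore reflects the nonlinearity.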
Comparing the sparse coding network to physiology
To compare the results of a sparse coding network to known physiological data, we measured the ratio of the bandwidth predicted from the response to spots to the bandwidth measured with gratings. Figure 9 shows the results of the sparse coding network for three levels of overcompleteness. For the 1.3-times overcomplete network there is only a small difference between the bandwidths measured with gratings and spots. The ratio of bandwidths is 1.25. That is, this network produces only a small amount of curvature (hyperselectivity), and therefore the receptive fields do not differ much with respect to the feed-forward basis. As the overcompleteness increases, the ratio of the bandwidths also increases. The rightmost bar in Figure 9 provides an estimate of this ratio found with recordings in V1 neurons. These data are derived from figure 1A of Tadmor and Tolhurst (1989). In that study, the responses of 33 V1 neurons were measured for both bars and gratings. Of those 33 neurons, 24 produced measurable bandwidths, and the researchers plotted the bandwidths as measured with gratings against the bandwidths that were predicted from the responses to the bars. From a visual inspection of their figure, we calculated the mean ratio between these two measures. This produced a ratio of roughly 2.0. Comparing these results to the sparse coding network suggests that this will correspond to a level of overcompleteness of approximately 2.3. 
Figure 9
 
To allow a comparison between the nonlinearities in a sparse coding network and the nonlinearities in V1 neurons, we have plotted the results of the network in terms of the ratio of bandwidths calculated from the response to gratings and the Fourier transform of the response to spots. As the network becomes more overcomplete, the response to gratings becomes narrower while the response to spots becomes broader. For example, with a 3.9-times overcomplete network, the bandwidths derived from spots are 3.6 times broader than the bandwidths derived from gratings. On the far right, we show the ratio of bandwidths estimated from figure 1A of Tadmor and Tolhurst (1989). See text for details.
This appears to contradict the current estimates of the degree of overcompleteness in V1, where estimates are in the range of 100:1 (Leuba & Kraftsik, 1994). However, this is not a fair comparison. First, nine of the 33 neurons in the Tadmor and Tolhurst (1989) study did not produce measurable bandwidths, and we believe this underestimates the bandwidth difference. Second, actual V1 neurons respond to more than two dimensions of space. They can also be selective to temporal frequency (e.g., motion), color, and disparity. The different dimensions are likely to be multiplicative in terms of overcompleteness. If space, color, time, and disparity are each 3-times overcomplete, then the full system will be 3 × 3 × 3 × 3 = 81-times overcomplete. For our simulations with static, gray-level images, the difference between receptive fields measured with spots and gratings corresponds to about a 3-times overcomplete system. This estimate seems low to us, and does not consider other types of nonlinearities (e.g., complex cells with endo-origin curvature), but it is an interesting approach to estimating the degree of overcompleteness in a system. 
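The multiplicative argument can be written out as a short calculation. The sketch below simply restates the hypothetical case from the text, in which each of four stimulus dimensions is 3-times overcomplete; the dictionary and its labels are illustrative only.

```python
import math

# Hypothetical per-dimension overcompleteness factors from the example in the text.
factors = {"space": 3, "color": 3, "time": 3, "disparity": 3}

# If the dimensions combine multiplicatively, the total is the product of the factors.
total = math.prod(factors.values())
print(total)  # 81
```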
Tadmor and Tolhurst attempted to model this nonlinear behavior with a high-threshold model of the neuron. They found that a high threshold did produce some of this effect; however, they also noted that the magnitude of the threshold was often required to be unrealistically high. It is worth noting the difference between our curvature and the high-threshold model. With our curvature, the response of the neuron near the neuron's optimal response is largely unchanged. As the curvature is increased, the responses away from the optimal stimulus (Smax) are reduced (i.e., they require more stimulus contrast). The curvature is, in a sense, a model of a variable threshold, with the lowest thresholds near the optimal stimulus and increased thresholds away from that optimal stimulus. Therefore, if the neuron is probed with a stimulus that is near the optimal response for the neuron, then the neuron will show a lower stimulus threshold. 
Tolhurst and Heeger (1997) showed that a model with gain control can produce a more effective model of this behavior. Indeed, we have argued (Golden et al., 2016) that this is achieved because their gain-control model (e.g., divisive normalization) curves the iso-response surfaces (see also Zetzsche & Nuding, 2005). 
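To illustrate how a divisive gain control can bend an iso-response contour, consider the following toy sketch. It is not the Tolhurst and Heeger (1997) model. It assumes a two-unit normalization pool with orthogonal weights, a response of the form r = k·a / (σ + k·(a + b)), where k is stimulus contrast, a = cos θ is the drive to the unit, and b = |sin θ| is the drive to a single neighbor. The function names, σ = 1, and the response level of 0.3 are our own illustrative choices, and the angle is restricted to less than 90° from the optimal stimulus.

```python
import math

def linear_contrast(theta_deg, k0):
    """Contrast a linear unit needs at angle theta to match its on-axis response at contrast k0."""
    return k0 / math.cos(math.radians(theta_deg))

def normalized_contrast(theta_deg, level, sigma=1.0):
    """Contrast needed to reach `level` when r = k*a / (sigma + k*(a + b)),
    with a = cos(theta) driving the unit and b = |sin(theta)| driving one neighbor."""
    a = math.cos(math.radians(theta_deg))
    b = abs(math.sin(math.radians(theta_deg)))
    denom = a - level * (a + b)
    if denom <= 0:
        return math.inf  # this response level cannot be reached in this direction
    return level * sigma / denom

level = 0.3
k0 = normalized_contrast(0.0, level)  # on-axis contrast for the chosen response level
for theta in (0, 15, 30, 45):
    # Columns: angle, contrast on the flat (linear) contour, contrast on the normalized contour.
    print(theta, round(linear_contrast(theta, k0), 3), round(normalized_contrast(theta, level), 3))
```

In this toy case the two contours coincide at the optimal stimulus, but away from it the normalized unit requires more contrast than the linear unit to reach the same response. That is, the iso-response contour bends away from the origin.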
Curvature and the Gabor limit
The results in the previous sections demonstrate that the neurons generated with the sparse coding algorithm produced relatively narrow tuning when measured with gratings compared to what was expected from the results measured with spots. This implies that the size and spatial-frequency bandwidths for these neurons can both be relatively small. This may appear to be in opposition to the fundamental limits imposed by the Gabor–Heisenberg uncertainty principle. Here we want to note that both the receptive fields in sparse coding networks and actual V1 neurons can break the apparent limit, and we argue that this is a fundamental component of sensory processing. 
In the 1970s, there was considerable debate regarding why a neuron would be selective to spatial frequency. This had an apparent resolution when Marĉelja (1980) introduced Gabor's (1946) article to the field. It was argued that the V1 simple cells showed a number of similarities to the units of information proposed by Gabor. Gabor had noted that there was a fundamental limit on the width and bandwidth of a function. The trade-off between selectivity to location and spatial frequency was derived from the original uncertainty principle (Heisenberg, 1927), which was directed at the limitations in identifying the location and momentum of a particle. The function that was optimally located in space and frequency was the Gaussian modulated sinusoid, or what we now call a Gabor function. The Gabor function has been shown to be a reasonably good fit to V1 receptive fields (e.g., Field & Tolhurst, 1986; Jones & Palmer, 1987). 
This Gabor limit is often considered to be a fundamental limit to localization. However, nonlinear neurons that base their response on both the input and the activity of surrounding neurons are not faced with this limitation. We argue that sensory neurons can and often do break this alleged localization limit. For a simple linear receptive field, the Gabor limit holds. However, the curvature we have described creates a form of soft winner-take-all. In the original uncertainty principle, the more precisely the position of some particle is determined, the less precisely its momentum can be known, and vice versa. This limitation follows because the particle can be detected just once. If it could be detected multiple times with different detectors without affecting the properties of the particle, then the Heisenberg limit could be exceeded. 
Consider the following extreme example. We have a detector that responds to only one pixel in an image. If any other pixel is nonzero, the detector is shut off. In such a case, the detector would respond to only those cases where the image consists of that single pixel on. The detector has precise location and identity information. One can play this game with any orthogonal basis, such that the detector fires only when it has activity with no other activity. In terms of our hyperselective functions, the response of each neuron is determined by the multiple detectors looking at the same stimulus. With nonlinear interactions between neighbors allowed, the output of each neuron can go below the localization limits described by Gabor. 
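The extreme example above can be made concrete with a few lines of code. This is a minimal sketch using hypothetical function names and a four-pixel toy image; it contrasts the "shut off" detector with a linear unit that has the same one-pixel receptive field.

```python
def exclusive_detector(image, index):
    """Respond to pixel `index` only if every other pixel is zero."""
    others_silent = all(v == 0 for i, v in enumerate(image) if i != index)
    return image[index] if others_silent else 0.0

def linear_detector(image, index):
    """A linear unit with a one-pixel receptive field ignores the other pixels."""
    return image[index]

single = [0.0, 0.0, 1.0, 0.0]     # only the target pixel is on
cluttered = [0.5, 0.0, 1.0, 0.0]  # the target pixel plus one other pixel

print(exclusive_detector(single, 2), exclusive_detector(cluttered, 2))  # 1.0 0.0
print(linear_detector(single, 2), linear_detector(cluttered, 2))        # 1.0 1.0
```

Mapped one pixel at a time, the two detectors have the same receptive field, yet only the exclusive detector signals that the image consists of that pixel alone.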
Figure 10 compares the localization of our linear and nonlinear receptive fields in relation to the Gabor limit. Figures 10a and 10b show the calculations that we performed on a sample of receptive fields. In Figure 10a we show the feed-forward weights (the basis) in space and frequency. For this linear receptive field, the Gabor limit should hold. We calculated the size of the receptive field by fitting a 2-D Gabor function to the spatial response (response to spots), and we calculated the bandwidth by fitting a 2-D Gaussian to the frequency response (response to gratings). This produces four measures (two in space and two in frequency). We then multiply these measures to determine the localization factor (see Equation 13). The argument from Gabor suggests that the localization factor should not fall below 1/(4π²) (referred to as the Gabor limit). 
\begin{equation}\tag{13}{\rm{Localization\ Factor}} = \Delta X*\Delta Y*\Delta U*\Delta V\end{equation}
 
Figure 10
 
The hyperselectivity created by the curvature can produce neurons that are more localized than predicted from the Gabor limit. We show the results for a 2.6-times and a 3.9-times overcomplete sparse coding network (e.g., Figure 7). For each neuron in the network we measured the width in space (ΔX and ΔY) and the width in frequency (ΔU and ΔV). For a Gabor function, the product of these widths \(\Delta X*\Delta Y*\Delta U*\Delta V\) will be 1/(4π²). We call this product the localization factor and plot the results for each neuron in the network in (c–d). For each neuron, we plot the localization factor for the feed-forward basis (cyan) and the localization factor following nonlinear interactions that produce hyperselectivity (orange). The nonlinear interactions allow the localization factor to fall below the Gabor limit. The dotted lines show the mean localization factor for the linear and the nonlinear conditions. The triangle and square in (c) represent the neuron as depicted in (a–b).
For each neuron in our network we also calculated the localization factor for the neurons after the nonlinear interactions produced by the sparse coding network. Figure 10b shows the receptive field and spatial-frequency response for the same neuron shown in Figure 10a. It is apparent that the neuron is more localized in both space and frequency. The localization factor for this neuron has dropped from 0.045 (triangle) to 0.011 (square), which is below the Gabor limit (1/(4π²) = 0.025). 
Figures 10c and 10d show these calculations for each neuron in a 2.6- and a 3.9-times overcomplete network. The blue dots in each figure represent the localization factors measured for the feed-forward basis. Since these functions are linear, the Gabor limit should hold. The black line shows this Gabor limit. If these receptive fields were perfect Gabor functions, then their localization factors should fall on the Gabor-limit line. Since the majority are not perfect Gabor functions, they fall above this line. The orange dots represent the localization factor for each neuron after the nonlinear interactions produced by the sparse coding network. It can be seen that the majority of neurons now fall below the Gabor limit. It is also apparent that when the network is more overcomplete (Figure 10d), the mean localization factor (red line) drops further below the Gabor limit. 
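The localization factor of Equation 13 and its comparison with the Gabor limit can be sketched as follows. The width convention is an assumption on our part: we take each width to be the standard deviation of a Gaussian amplitude envelope, with frequency expressed in cycles per unit distance, because this convention reproduces the stated limit of 1/(4π²). The function names are hypothetical, and the two sample values are the single-neuron localization factors quoted above.

```python
import math

GABOR_LIMIT = 1.0 / (4.0 * math.pi ** 2)  # about 0.025

def localization_factor(dx, dy, du, dv):
    """Equation 13: product of the two spatial widths and the two frequency widths."""
    return dx * dy * du * dv

def gaussian_envelope_factor(sigma_x, sigma_y):
    """Localization factor for a Gaussian envelope exp(-x^2 / (2 sigma^2)), whose
    Fourier transform is a Gaussian with standard deviation 1 / (2 pi sigma)."""
    du = 1.0 / (2.0 * math.pi * sigma_x)
    dv = 1.0 / (2.0 * math.pi * sigma_y)
    return localization_factor(sigma_x, sigma_y, du, dv)

# Under this convention a Gaussian envelope sits on the limit whatever its size.
print(GABOR_LIMIT, gaussian_envelope_factor(2.0, 3.5))

# Single-neuron values quoted in the text: feed-forward basis versus nonlinear response.
for label, value in (("basis", 0.045), ("nonlinear", 0.011)):
    print(label, value, "below limit" if value < GABOR_LIMIT else "at or above limit")
```

Under this convention the spatial and frequency widths trade off exactly, so changing the envelope size leaves the product unchanged. Only a change in the shape of the function, or a nonlinearity, moves the localization factor away from the limit.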
The curvature caused by the neighborhood interactions allows these neurons to provide information about both space and frequency. The hyperselectivity created by these interactions allows neurons to be selective in multiple domains. However, we do not believe that the general goal of this hyperselectivity is to provide more precise information about spatial frequency or position. Rather, it is to provide a sparse efficient representation of images. The curvature around the primary basis serves to remove the redundancy created with an overcomplete code. It allows the network to isolate causes that are not orthogonal to each other. The main conclusion from this section is that the Gabor limit is not a fundamental limit on the localization of nonlinear sensory neurons. 
Discussion
In this article, we have argued that it is important to distinguish between two forms of selectivity. The first (classic selectivity) is represented by the optimal stimulus for a neuron. The second (hyperselectivity) is represented by the falloff in response (the curvature) around this optimal stimulus. Our primary goal in this article is to make a clear distinction between these two ideas of selectivity and describe some of the implications of this distinction. 
In line with our previous work (Field & Wu, 2004; Golden et al., 2016), we have argued that overcomplete sparse coding will produce a curvature in the response surfaces of neurons. This provides an efficient approach to representing data when there are more descriptors than dimensions (e.g., more neurons than pixels). These ideas of understanding nonlinearities in terms of response curvature can be traced back to the work of Zetzsche and colleagues (Zetzsche et al., 1999; Zetzsche & Nuding, 2005; Zetzsche & Rohrbein, 2001). That work has argued that well-known nonlinearities in the early visual system are geometrically represented by a curvature in the response surface. 
In this article, we have contrasted four approaches to modeling these nonlinearities. In Figure 4, we showed examples of the curvature produced by classic models of sparse coding, gain control, and a particular example of a linear-nonlinear model. Each of these approaches can curve the iso-response surfaces, but they do it in different ways. The form of the curvature is important in understanding how these different approaches represent the data. Gain control with divisive normalization has curvature which becomes narrower with increasing response magnitude. This was not found to be the case with a cascaded linear-nonlinear model or with the sparse coding model in low dimensions, but as noted by Golden et al. (2016), this may depend on the cost function of the sparse coding model. As we noted, we are not arguing for one particular form of curvature. Each of these has advantages, and although we lean toward the fan equation, we do not believe there is sufficient physiological evidence to distinguish between these models. 
We noted one particular result of this curvature. The receptive field as measured with an orthonormal basis depends on the particular basis used to measure the receptive field. The sparse coding network of Olshausen and Field (1996) produces a curvature that alters the apparent receptive field. As noted by Olshausen and Field (1997), inhibition from the neighboring neurons produces a smaller receptive field than the feed-forward weights. Similarly, the spatial-frequency and orientation bandwidths are narrower than predicted from the feed-forward weights. In Figure 8, we measured the angular difference between these receptive fields. We found that the receptive fields as measured with spots differed from the feed-forward inputs by 38°, while the receptive field as measured from gratings was 50° from the feed-forward weights. The receptive fields as measured with either gratings or spots did not describe the optimal stimulus for the neuron. The neuron produced a significantly larger response to a stimulus aligned with the feed-forward inputs (Figure 8). These data provide quantitative examples of the differences between the receptive field as measured with a particular orthonormal basis set and the optimal stimulus (Smax) for a neuron. The analysis we describe here probes the hyperselectivity in a simple unsupervised network (sparse coding). We are confident that this geometric approach to hyperselectivity will prove useful for analyzing multilayer hierarchical (deep) networks as well. Currently (Golden, Vilankar, & Field, 2017), we are investigating networks which learn both invariance and selectivity (e.g., Karklin & Lewicki, 2005) and are planning to explore the units of supervised deep networks. Visualization of the hidden layers of a deep network is a major topic of research, and it is well known that a single receptive field is insufficient to describe any given unit's response. 
It is not yet clear what sort of curvature is produced by the nonlinearities typical of these networks (e.g., max pooling). However, the selectivity and tolerance shown by these networks imply to us that hyperselective curvature (exo-origin) and invariant curvature (endo-origin) make an important contribution to their success. 
Finally, in the last section we demonstrated that this curvature results in neurons that are more localized than one might expect from the Gabor–Heisenberg limitations. This result may appear to be either fundamental or trivial, depending on one's perspective. A similar debate arose from an article showing that human observers can identify the frequency of a tone and localize that tone better than the Gabor limit (Oppenheim & Magnasco, 2013). Although this result received considerable press at the time, the authors note that it simply rules out a simple linear model where single linear neurons are used to make the computations. Here we have shown that the curvature produced by sparse coding allows individual neurons to break this limit. However, they achieve this by effectively comparing the responses to neighboring neurons. The early work with modeling V1 simple cells as Gabor functions (e.g., Daugman, 1990; Field & Tolhurst, 1986; Marĉelja, 1980; Palmer, 1999) pointed to this limit as a possible account of why these neurons had the shape of a Gaussian modulated sinusoid. Here we have shown that the nonlinearities inherent in the early visual system allow neurons in the visual system to exceed this lower limit. We argue that this provides a means to allow an efficient overcomplete representation. 
We have described this exo-origin curvature as hyperselectivity (Golden et al., 2016). A closely related concept is described by Tsai and Cox (2015), who use the term “advanced selectivity” to describe their particular version of hyperselectivity. In their measure, a series of hypothetical stimuli with constant root-mean-square contrast is presented to a neuron (or to a unit in a deep network). As one moves away from the optimal stimulus for the neuron, a linear neuron will fall off at a particular rate. A neuron with advanced selectivity will fall off at a faster rate. This is very similar to our proposal; however, there are differences. By their measure, a simple output nonlinearity can produce advanced selectivity (see Supplementary Materials). If the slope of the output is higher than that of a linear neuron, then the falloff will be faster than a linear neuron's. We believe that the curvature in the iso-response surface produces a better description of the selectivity. However, it should be noted that in both accounts, a neuron that produces less curvature at low contrasts than at higher contrasts (e.g., the gain-control model shown in Figure 4c) requires a more complex account than would be provided by a single parameter. 
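The distinction between the two measures can be illustrated with a unit that has only a pointwise output nonlinearity. The sketch below is ours and is not taken from Tsai and Cox (2015). It assumes a response r = (k·cos θ)^p, where k is the stimulus contrast (treated here as vector length), θ is the angle from the optimal stimulus, and p = 2 is an arbitrary expansive exponent; the function names are hypothetical.

```python
import math

def response(theta_deg, contrast, power):
    """Unit with a pointwise output nonlinearity: r = (contrast * cos(theta)) ** power."""
    drive = max(0.0, contrast * math.cos(math.radians(theta_deg)))
    return drive ** power

def iso_response_contrast(theta_deg, level, power):
    """Contrast needed at angle theta (below 90 degrees) to reach `level`."""
    return level ** (1.0 / power) / math.cos(math.radians(theta_deg))

# Falloff at fixed contrast: the squaring unit drops faster than the linear unit.
for theta in (0, 30, 60):
    print(theta, round(response(theta, 1.0, 1), 3), round(response(theta, 1.0, 2), 3))

# Iso-response contours: for both units the projection k * cos(theta) is constant,
# so the contour is flat (a plane orthogonal to the weights) in both cases.
for theta in (0, 30, 60):
    cos_theta = math.cos(math.radians(theta))
    k_lin = iso_response_contrast(theta, 0.5, 1)
    k_sq = iso_response_contrast(theta, 0.5, 2)
    print(theta, round(k_lin * cos_theta, 3), round(k_sq * cos_theta, 3))
```

In this toy case the squaring unit shows a steeper falloff at fixed contrast, yet its iso-response contours remain planar. The falloff measure and the curvature measure can therefore disagree for a unit of this kind.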
As we have noted throughout this article, many of the ideas presented here have origins in the work of Zetzsche and colleagues. That work first proposed to consider early nonlinearities in terms of a curved response space. Much of the work here is considering the implications of that curved space. In that light, we should make a comparison between the ideas of hyperselectivity and the “intrinsic two-dimensional” neurons discussed in their work (e.g., Barth, Zetzsche, & Rentschler, 1998). They have noted that the hyperselectivity produced from this curved space can act like an AND process and allows selectivity to more complex features like corners and junctions. The AND-like process allows the neuron to be selective to the combination of features without responding to the components. This creates a form of selectivity that is more intrinsic to the 2-D structure of the stimulus than can be produced with a linear filter. Barth et al. (1998) have shown that such nonlinear filters can be a useful way of extracting higher level statistics of a scene. In this article, we have focused on the properties of V1 neurons and have looked at the implications of this for neural tuning. These neurons respond to one-dimensional stimuli (e.g., edges) but are more selective to these edges than can be expected from a simple neuron with planar nonlinearities. Although these neurons are not clearly intrinsic two-dimensional neurons, there is a very strong relationship to Zetzsche's proposal. In reference to Figure 6, we can say that the neuron responds to the stimulus of the primary neuron (center) AND NOT when there is a significant response from the neighboring neurons. 
In this article, we have not focused on the issue of invariance or tolerance. Neurons in area V1 and beyond show both hyperselectivity and varying degrees of tolerance and invariance (e.g., Rust & DiCarlo, 2010). As noted by Golden et al. (2016), invariance and tolerance are represented by curvature toward the origin (endo-origin curvature). It is possible to have neurons that have both endo- and exo-origin curvature. This allows a neuron to be invariant or tolerant to some features and selective to others. Sparse coding will produce only hyperselectivity. To produce tolerance, a different form of network is required. A wide range of possible architectures may allow for this. We are currently exploring the behavior of various architectures that produce tolerance (e.g., Karklin & Lewicki, 2005; Krizhevsky, Sutskever, & Hinton, 2012). The cascaded linear-nonlinear network discussed earlier (Pagan et al., 2016) can produce both endo- (x² + y²) and exo-origin (x² − y²) curvatures. It can also produce a combination (e.g., x² + y²z²). We feel that this approach as currently conceived has a number of limitations (e.g., the forms of curvature and the tiling that are produced). However, modifications of this approach may well provide a means of developing efficient hierarchical networks. 
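The two quadratic forms can be compared directly by tracing one iso-response contour of each. The sketch below is a minimal illustration with hypothetical function names; x is taken as the stimulus component along the preferred direction, y as the component along a second direction, and the response level of 1.0 is arbitrary.

```python
import math

def endo_x(y, level):
    """x on the iso-response contour x^2 + y^2 = level (requires y^2 <= level)."""
    return math.sqrt(level - y * y)

def exo_x(y, level):
    """x on the iso-response contour x^2 - y^2 = level."""
    return math.sqrt(level + y * y)

level = 1.0
for y in (0.0, 0.3, 0.6):
    # Columns: y, x needed on the endo-origin contour, x needed on the exo-origin contour.
    print(y, round(endo_x(y, level), 3), round(exo_x(y, level), 3))
```

As y grows, the endo-origin contour needs less x to hold the response constant (it bends toward the origin, giving tolerance), whereas the exo-origin contour needs more x (it bends away from the origin, giving hyperselectivity).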
We wish to finish this discussion with a general comment regarding the primary goal of the hyperselective curvature. If this curvature also creates gain control, end stopping, basis-dependent bandwidths, and efficient overcomplete representations, how do we argue that one function is primary? Carandini and Heeger (2012) and Schwartz and Simoncelli (2001) suggest that gain control might be primary, with other features secondary. Certainly, if all of these behaviors are a result of curvature, then the problem is entangled. However, we argue that the primary goal of this hyperselectivity is not to produce gain control. In the sparse coding algorithm, there is no limit on the gain of the neuron; the neurons do not saturate. Rather, the curvature is a solution to the overcomplete representation. In the sparse coding network, the directions of the vectors (feed-forward weights) are optimized by the network and selected on the basis of the statistics of the inputs. However, curvature in this network depends primarily on the angle between these vectors. 
So can we argue that all forms of hyperselective curvature are due to the angle between neighboring neurons? This would be an interesting experimental test in recordings of V1 neurons. If we measure the receptive fields of a small population of neurons in V1, can the nonlinearities be predicted by the angles between neighboring neurons? This may be true in V1 but not at higher levels. If the network does not require that the image be reconstructed accurately (e.g., the only goal is to classify), then the curvature there need not be tied to the angles between neighboring neurons. The cascaded linear-nonlinear model of Pagan et al. (2016) allows the curvature in the network to be learned independently of the neuron's optimal stimulus (Smax). The network also allows hyperselectivity (exo-origin curvature) as well as tolerance (endo-origin curvature). Since the goal of this network is to classify, it is not clear what sets of curvatures will be optimized by the network. It is also not clear what forms of curvature would be optimal (e.g., gain control versus the fan equation). At some point, we hope that physiology will produce a clear insight into the curvature produced in the mammalian visual system. As we mentioned earlier, the methods with the most promise are those that use the spike triggered covariance technique (e.g., Rust et al., 2004, 2005; Schwartz et al., 2002; Schwartz et al., 2006; Vintch et al., 2015). These techniques can reveal the relevant subspace where the nonlinear interactions between neighboring neurons are occurring. By recording from multiple neurons, it may be possible to determine if the nonlinearities can be predicted by the angles between neighboring neurons. The difficulty with this approach is that an accurate description of the subspace requires a very large number of stimuli. Vintch et al. (2015) have provided some of the most interesting recent results. 
However, we think that to get an accurate account of curvature, the inhibitory subspaces must be probed with a denser array of stimuli. Ideally, once the interesting subspace has been determined with spike-triggered covariance, a new set of stimuli can be created that focuses on that subspace. 
Summary
We have shown in this paper that by comparing the geometry of iso-response surfaces, it is possible to compare and contrast four different models of V1 nonlinear behavior. Each of these models warps the iso-response contours to produce a hyperselective response. However, each model uses a different form of curvature. We have argued that the exo-origin curvature in these models produces a form of hyperselectivity that results in a number of interesting behaviors. It allows neurons to be highly selective to a broadband stimulus. It produces receptive fields and spatial frequency tuning that are not Fourier transforms of each other. Finally, the hyperselectivity allows these neurons to break through the Gabor limit. In this study we focused on only the two-dimensional cross-sections of the subspace between two interacting neurons. We are currently investigating the higher dimensional curvatures and the behavior of networks that produce both selectivity and tolerance/invariance (Golden et al., 2017). We believe that the low-dimensional components of curvature can help illuminate the behavior of these complex networks. 
Acknowledgments
This material is based upon work supported by, or in part by, a Google Faculty Research Award to DJF and a Dallenbach Fellowship to KPV. 
Commercial relationships: none. 
Corresponding author: David J. Field 
Address: Department of Psychology, Cornell University, Ithaca, NY, USA. 
References
Barlow, H. B. (1953). Summation and inhibition in the frog's retina. The Journal of Physiology, 119 (1), 69–88.
Barth, E., Zetzsche, C., & Rentschler, I. (1998). Intrinsic two-dimensional features as textons. Journal of the Optical Society of America A, 15 (7), 1723–1732.
Carandini, M., & Heeger, D. J. (2012). Normalization as a canonical neural computation. Nature Reviews Neuroscience, 13 (1), 51–62.
Chen, X., Han, F., Poo, M.-M., & Dan, Y. (2007). Excitatory and suppressive receptive field subunits in awake monkey primary visual cortex (V1). Proceedings of the National Academy of Sciences, USA, 104 (48), 19120–19125.
Daugman, J. G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A, 2 (7), 1160–1169.
Daugman, J. G. (1990). An information-theoretic view of analog representation in striate cortex. Computational Neuroscience, 2, 403–423.
David, S. V., & Gallant, J. L. (2005). Predicting neuronal responses during natural vision. Network: Computation in Neural Systems, 16 (2–3), 239–260.
De Boer, E., & Kuyper, P. (1968). Triggered correlation. IEEE Transactions on Biomedical Engineering, BME-15 (3), 169–179.
De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1978). Cortical cells: Bar and edge detectors, or spatial frequency filters? In Cool S. J. & Smith E. L. (Eds.), Frontiers in visual science (pp. 544–556). Berlin, Heidelberg: Springer.
De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Research, 22 (5), 545–559.
De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1985). Periodicity of striate-cortex-cell receptive fields. Journal of the Optical Society of America A, 2 (7), 1115–1123.
Field, D. J. (1994). What is the goal of sensory coding? Neural Computation, 6 (4), 559–601.
Field, D. J., & Tolhurst, D. J. (1986). The structure and symmetry of simple-cell receptive-field profiles in the cat's visual cortex. Proceedings of the Royal Society of London. Series B: Biological Sciences, 228 (1253), 379–400.
Field, D. J., & Wu, M. (2004). An attempt towards a unified account of non-linearities in visual neurons [Abstract]. Journal of Vision, 4 (8): 283, doi:10.1167/4.8.283.
Gabor, D. (1946). Theory of communication. Part 1: The analysis of information. Journal of the Institution of Electrical Engineers—Part III: Radio and Communication Engineering, 93 (26), 429–441.
Geisler, W. S., & Albrecht, D. G. (1992). Cortical neurons: Isolation of contrast gain control. Vision Research, 32 (8), 1409–1410.
Golden, J. R., Vilankar, K. P., & Field, D. J. (2017). Curvature of neural response surfaces in a multi-layer network. Manuscript in preparation.
Golden, J. R., Vilankar, K. P., Wu, M. C., & Field, D. J. (2016). Conjectures regarding the nonlinear geometry of visual neurons. Vision Research, 120, 74–92.
Hartline, H. K., Wagner, H. G., & Ratliff, F. (1956). Inhibition in the eye of Limulus. The Journal of General Physiology, 39 (5), 651–673.
Heeger, D. J. (1992). Normalization of cell responses in cat striate cortex. Visual Neuroscience, 9, 181–197.
Heisenberg, W. (1927). Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik, 43 (3–4), 172–198.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex. The Journal of Physiology, 160, 106–154.
Jones, J. P., & Palmer, L. A. (1987). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58 (6), 1233–1258.
Karklin, Y., & Lewicki, M. S. (2005). A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural Computation, 17 (2), 397–423.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Pereira F., Burges C. J. C., Bottou L., & Weinberger K. Q. (Eds.), Advances in neural information processing systems (pp. 1097–1105). Red Hook, NY: Curran Associates.
Kuffler, S. W. (1953). Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16, 37–68.
Leuba, G., & Kraftsik, R. (1994). Changes in volume, surface estimate, three-dimensional shape and total number of neurons of the human primary visual cortex from midgestation until old age. Anatomy and Embryology, 190 (4), 351–366.
Mante, V., Frazor, R. A., Bonin, V., Geisler, W. S., & Carandini, M. (2005). Independence of luminance and contrast in natural scenes and in the early visual system. Nature Neuroscience, 8 (12), 1690–1697.
Marčelja, S. (1980). Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of America, 70 (11), 1297–1300.
Mély, D. A., & Serre, T. (2016). Towards a unified model of classical and extra-classical receptive fields. In Zhao Q. (Ed.), Computational and cognitive neuroscience of vision (pp. 59–84). Singapore: Springer.
Mély, D. A., & Serre, T. (2017). Towards a theory of computation in the visual cortex. In Zhao Q. (Ed.), Computational and cognitive neuroscience of vision (pp. 59–84). Singapore: Springer.
Murray, R. F. (2011). Classification images: A review. Journal of Vision, 11 (5): 2, 1–25, doi:10.1167/11.5.2.
Nestares, O., & Heeger, D. J. (1997). Modeling the apparent frequency-specific suppression in simple cell responses. Vision Research, 37 (11), 1535–1543.
Olshausen, B. A. (2013). Highly overcomplete sparse coding. In Proceedings of SPIE, 8651, 86510S.
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381 (6583), 607–609.
Olshausen, B. A., & Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37, 3311–3325.
Olshausen, B. A., & Field, D. J. (2005). How close are we to understanding V1? Neural Computation, 17 (8), 1665–1699.
Oppenheim, J. N., & Magnasco, M. O. (2013). Human time-frequency acuity beats the Fourier uncertainty principle. Physical Review Letters, 110 (4), 044301.
Pagan, M., Simoncelli, E. P., & Rust, N. C. (2016). Neural quadratic discriminant analysis: Nonlinear decoding with V1-like computation. Neural Computation, 28, 2291–2319.
Palmer, S. E. (1999). Vision science: From photons to phenomenology. Cambridge, MA: MIT Press.
Prenger, R., Wu, M. C.-K., David, S. V., & Gallant, J. L. (2004). Nonlinear V1 responses to natural scenes revealed by neural network analysis. Neural Networks, 17 (5), 663–679.
Rabinowitz, N. C., Willmore, B. D., Schnupp, J. W., & King, A. J. (2011). Contrast gain control in auditory cortex. Neuron, 70 (6), 1178–1191.
Ringach, D., & Shapley, R. (2004). Reverse correlation in neurophysiology. Cognitive Science, 28 (2), 147–166.
Rust, N. C., & DiCarlo, J. J. (2010). Selectivity and tolerance (invariance) both increase as visual information propagates from cortical area V4 to IT. The Journal of Neuroscience, 30 (39), 12978–12995.
Rust, N. C., Schwartz, O., Movshon, J. A., & Simoncelli, E. (2004). Spike-triggered characterization of excitatory and suppressive stimulus dimensions in monkey V1. Neurocomputing, 58, 793–799.
Rust, N. C., Schwartz, O., Movshon, J. A., & Simoncelli, E. P. (2005). Spatiotemporal elements of macaque V1 receptive fields. Neuron, 46 (6), 945–956.
Sachs, M. B., Nachmias, J., & Robson, J. G. (1971). Spatial-frequency channels in human vision. Journal of the Optical Society of America, 61, 1176–1186.
Schwartz, O., Chichilnisky, E., & Simoncelli, E. P. (2002). Characterizing neural gain control using spike-triggered covariance. In Dietterich T. G., Becker S., & Ghahramani Z. (Eds.), Advances in neural information processing systems (pp. 269–276). Cambridge, MA: MIT Press.
Schwartz, O., Pillow, J. W., Rust, N. C., & Simoncelli, E. P. (2006). Spike-triggered neural characterization. Journal of Vision, 6 (4): 13, 484–507, doi:10.1167/6.4.13.
Schwartz, O., & Simoncelli, E. P. (2001). Natural signal statistics and sensory gain control. Nature Neuroscience, 4, 819–825.
Sharpee, T. O. (2013). Computational identification of receptive fields. Annual Review of Neuroscience, 36, 103–120.
Tadmor, Y., & Tolhurst, D. (1989). The effect of threshold on the relationship between the receptive-field profile and the spatial-frequency tuning curve in simple cells of the cat's striate cortex. Visual Neuroscience, 3 (5), 445–454.
Tolhurst, D., & Heeger, D. (1997). Comparison of contrast-normalization and threshold models of the responses of simple cells in cat striate cortex. Visual Neuroscience, 14 (2), 293–309.
Tsai, C.-Y., & Cox, D. D. (2015). Measuring and understanding sensory representations within deep networks: Using a numerical optimization framework. Available at http://arxiv.org/abs/1502.04972.
Vintch, B., Movshon, J. A., & Simoncelli, E. P. (2015). A convolutional subunit model for neuronal responses in macaque V1. The Journal of Neuroscience, 35 (44), 14829–14841.
Vintch, B., Zaharia, A., Movshon, J., & Simoncelli, E. P. (2012). Efficient and direct estimation of a neural subunit model for sensory coding. In Dietterich T. G., Becker S., & Ghahramani Z. (Eds.), Advances in neural information processing systems (pp. 3104–3112). Cambridge, MA: MIT Press.
Yamins, D. L., Hong, H., Cadieu, C. F., Solomon, E. A., Seibert, D., & DiCarlo, J. J. (2014). Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, USA, 111 (23), 8619–8624.
Zetzsche, C., & Barth, E. (1990). Fundamental limits of linear filters in the visual processing of two-dimensional signals. Vision Research, 30 (7), 1111–1117.
Zetzsche, C., Krieger, G., & Wegmann, B. (1999). The atoms of vision: Cartesian or polar? Journal of the Optical Society of America A, 16 (7), 1554–1565.
Zetzsche, C., & Nuding, U. (2005). Nonlinear and higher-order approaches to the encoding of natural scenes. Network: Computation in Neural Systems, 16 (2–3), 191–221.
Zetzsche, C., & Röhrbein, F. (2001). Nonlinear and extra-classical receptive field properties and the statistics of natural scenes. Network: Computation in Neural Systems, 12 (3), 331–350.
Zhu, M., & Rozell, C. J. (2013). Visual nonclassical receptive field effects emerge from sparse coding in a dynamical system. PLoS Computational Biology, 9 (8), e1003191.
Figure 1
 
In this article we will often refer to the angle between two neurons. We are referring to the angle in image state space. The figure shows four pairs of neurons with their spatial-orientation selectivity difference and the angle between them in 2-D subspace defined by the two neurons in image state space. The angle represents the overlap between the two neurons' receptive fields \((\vec{rf}_1\ \text{and}\ \vec{rf}_2)\), calculated as shown in Equation 2. (a) An example of two neurons with receptive fields at different positions and not overlapping. Such neurons are orthogonal, as represented by two vectors on the right. (b) The receptive fields are overlapping, which produces an angle that is less than 90°. (c–d) Examples of pairs of neurons with overlapping receptive fields but different orientation selectivity. Although the orientation difference is the same in the two cases, the angles between the two neurons in image state space are different. Despite the overlap, the pair of neurons in (d) are nearly orthogonal, as represented on the right.
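The angle described in this caption can be sketched numerically. Equation 2 is not reproduced in this part of the article, so the sketch below assumes it is the usual normalized dot product (cosine) between the two receptive fields treated as vectors in image state space. The function name `angle_between` and the 4-pixel receptive fields are hypothetical and chosen only for illustration.

```python
import math

def angle_between(rf1, rf2):
    """Angle in degrees between two receptive fields given as flat pixel lists.

    Assumption: the angle is the arccosine of the normalized dot product.
    """
    dot = sum(a * b for a, b in zip(rf1, rf2))
    norm1 = math.sqrt(sum(a * a for a in rf1))
    norm2 = math.sqrt(sum(b * b for b in rf2))
    cosine = dot / (norm1 * norm2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))

# Hypothetical 4-pixel receptive fields (not taken from the article).
non_overlapping = angle_between([1, 1, 0, 0], [0, 0, 1, 1])  # like panel (a)
overlapping = angle_between([1, 1, 0, 0], [0, 1, 1, 0])      # like panel (b)
print(non_overlapping, overlapping)
```

Under this assumption, receptive fields with no shared pixels are orthogonal, and any positive overlap pulls the angle below 90°, which matches the qualitative description of panels (a) and (b).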
Figure 2
 
An example of a neuron that would classically be described as narrowly tuned. Such a neuron is well localized in the frequency domain. In this article, we argue that if a neuron is linear, then it is as selective as any other linear neuron. See text for details.
Figure 3
 
Four basic types of neural-response geometry in a 2-dimensional state space. On the left, the neuron is represented as a vector ([1, 0]) in the 2-dimensional state space. The two dimensions in these figures represent two stimulus directions: D1 represents the direction of the preferred stimulus of the neuron, and D2 represents an orthogonal direction (e.g., Figure 1). For a real neuron this can be considered to be a two-dimensional slice through the high-dimensional response space. Each of the colored lines (orthogonal or curved) is an iso-response contour which represents a set of stimuli in the image state space for which the given neuron responds with the same magnitude. The plots on the right show the response surface, where the z-axis represents response magnitude of the neuron. (a) Response geometry of a linear neuron. (b) Response geometry of a thresholded nonlinear neuron. (c) Response geometry of a compressive nonlinear neuron. (d) Response geometry of a warping nonlinear neuron. This figure is modified from that found in Golden et al., 2016.
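The distinction between these geometries can be illustrated with a small sketch. It assumes the neuron vector [1, 0] from the caption and uses hypothetical response functions: a threshold of 0.5, a saturating r/(1 + r) compression, and a divisive form for the warping case. The divisive form is only an illustrative stand-in and is not claimed to be the equation used for panel (d).

```python
import math

def linear(d1, d2):
    # Neuron vector is [1, 0]: only the D1 component contributes.
    return 1.0 * d1 + 0.0 * d2

def thresholded(d1, d2, threshold=0.5):
    # Output nonlinearity applied after the linear stage (hypothetical threshold).
    return max(0.0, linear(d1, d2) - threshold)

def compressive(d1, d2):
    # Saturating output nonlinearity (hypothetical form r / (1 + r)).
    r = max(0.0, linear(d1, d2))
    return r / (1.0 + r)

def warped(d1, d2, sigma=0.1):
    # Illustrative divisive form: the response now depends on D2 as well,
    # so the iso-response contours are no longer straight lines orthogonal to D1.
    return d1 / math.sqrt(sigma ** 2 + d1 ** 2 + d2 ** 2)

on_axis = (1.0, 0.0)   # stimulus along the preferred direction D1
off_axis = (1.0, 2.0)  # same D1 component, plus energy along D2

for model in (linear, thresholded, compressive, warped):
    print(model.__name__, model(*on_axis), model(*off_axis))
```

In this sketch the first three models give identical responses to the two stimuli, because an output nonlinearity changes only the spacing of the contours and leaves them straight. Only the last model responds less to the off-axis stimulus, which is the signature of curved iso-response contours.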
Figure 4
 
The types of curvature produced by four models of V1 nonlinearities. Each approach can produce hyperselectivity of variable magnitude. For each model, we plot the iso-response contours in two dimensions. We show these contours for a single neuron along with the contours for two neurons (second column: orthogonal; third column: nonorthogonal). The four approaches are sparse coding (a–c), the fan equation (d–f), gain control (g–i), and a cascaded linear-nonlinear model (j–l). For the sparse coding and fan-equation models, the curvature depends on the angle between neighboring neurons (angle in image state space). If the neighbors are orthogonal, there is likely to be no or little curvature. For gain control, the curvature depends on whether the neighboring neuron is part of the group involved in divisive normalization. This can produce curvature even in cases where the neurons are orthogonal. Each of these approaches curves the iso-response contours differently. More critically, the grid of the iso-response contours will cover the image space in different ways for each of these models.
Figure 5
 
(a, d) Receptive fields learned from a 1.3-times and a 13-times overcomplete sparse coding network, respectively. Classically, these receptive fields provide the primary way of describing the outputs of these networks. The 13-times overcomplete network, for example, produces more complex receptive fields that include plaids, spots, and curves. However, we argue that the nonlinearities in these codes change in consistent ways as the network becomes overcomplete. In particular, the overcomplete networks become more hyperselective. In a critically sampled code, the majority of neurons will be nearly orthogonal with their neighbors. In such a case, there is little curvature. (b) An example of the iso-response contours when the neighbors are orthogonal. In an overcomplete network there are more neurons than dimensions (e.g., pixels). This forces the angles between many neurons to be less than 90°. (c) The curvature in 2-D space when there are four neurons representing that space. (e) The curvature changes as the sparse coding network becomes more overcomplete. For this figure, we trained a sparse coding network on 8 × 8 natural-scene image patches. We varied the overcompleteness of the network from 1.3 times (e.g., Olshausen & Field, 1996) to 13 times. We then measured the curvature for the 2-D subspace defined between any pair of neurons in the network. For all of these networks, the majority of pairs will be orthogonal. We therefore measured the curvature for only the five neurons with the most overlap for each neuron in the network (i.e., the five neurons with the smallest angle in the image space). See text for details. (e) The average curvature as a function of overcompleteness. The figure also shows the average smallest angle of these five closest bases for each basis as a function of overcompleteness. As the network becomes more overcomplete, the curvature between neighbors increases (i.e., the network becomes more hyperselective).
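The neighbor selection described in this caption (the five neurons with the smallest angle to each neuron) could be sketched as follows. The helper names `angle_deg` and `closest_neighbors` are hypothetical, the angle is assumed to be the arccosine of the normalized dot product, and a basis and its negation are treated as far apart (180°), which the caption does not specify. The toy bases live in two dimensions purely for readability.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two basis vectors (assumed: arccos of cosine)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (nu * nv)))))

def closest_neighbors(bases, index, k=5):
    """Return (angle, neighbor_index) pairs for the k bases closest to bases[index]."""
    angles = [
        (angle_deg(bases[index], other), j)
        for j, other in enumerate(bases)
        if j != index
    ]
    return sorted(angles)[:k]

# Toy example: four unit vectors at 0, 10, 45, and 90 degrees in a 2-D space.
bases = [[math.cos(math.radians(t)), math.sin(math.radians(t))]
         for t in (0.0, 10.0, 45.0, 90.0)]
nearest_two = closest_neighbors(bases, 0, k=2)
print(nearest_two)
```

With a real learned dictionary, each basis would be a flattened image patch, and the curvature measurement would then be restricted to the 2-D subspaces spanned by a neuron and each of its selected neighbors.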
Figure 6
 
Hyperselectivity can create a neuron that is narrowly tuned to a broadband stimulus. (a) The receptive field of a neuron that would be classically defined as broadband. For example, such a neuron would respond to a range of orientations. The orientation tuning is shown by the red curve in (b) and (c). The green curves in (b) and (c) show the effects of the nonlinear inhibition by two neighbors with similar orientations. The response of the neuron was modeled by using the fan equation (although any of the models described in Figure 4 would produce this effect). (e–f) The curvature that is produced with these neighbors. With this curvature, the optimal stimulus is unchanged; however, the neuron responds less to nearby orientations. If the neuron is mapped with stimuli of different orientations, then it will appear to be narrowband. However, its preferred stimulus has not changed. The curvature produced by these neighboring neurons' interaction allows the neuron to be highly selective to this broadband stimulus; that is, its optimal stimulus Smax is still the broadband Gabor function shown in (a).
Figure 7
 
(a) The basis functions (feed-forward weights) learned using a 16 × 16 sparse coding network on natural-scene images. (b) The receptive fields obtained as response profiles mapped using spots. (c) The receptive fields reconstructed from the inverse Fourier transform of the frequency response (the response to gratings). The receptive fields from spots are consistently smaller than the basis functions, and the receptive fields from gratings are consistently larger than the basis functions (i.e., the bandwidths are narrower).
Figure 8
 
(a) The misestimate in finding the receptive field of a neuron in a sparse coding network when mapped with spots (pixels) and gratings. The figure shows visually how far the receptive-field estimates are from the true optimal stimulus Smax. (b) The average angle between the three vectors (basis, receptive field from spots, and receptive field from gratings shown in Figure 7). (c) The relative response of each neuron to stimuli that match either the basis, the receptive fields from spots, or the receptive fields from gratings. We also show the average response when moving 50° from the basis in 10,000 random directions. The results have been normalized such that the response in all conditions would be 1.0 if the neuron were linear. These results demonstrate that the optimal stimulus for these nonlinear neurons is determined by the feed-forward weights (i.e., the basis). This optimal stimulus is not represented by either the receptive fields from spots or the receptive fields from gratings. The hyperselectivity created by sparse coding significantly reduces the response away from this optimal stimulus.
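The probe stimuli that lie a fixed angle away from a basis in a random direction could be generated as in the sketch below. It assumes a Gram-Schmidt construction: draw a random Gaussian vector, remove its component along the basis, and mix the two unit vectors with cos θ and sin θ. The function name `stimulus_at_angle`, the random 64-dimensional basis, and the seed are all hypothetical; the article's own procedure is not given in this section.

```python
import math
import random

def stimulus_at_angle(basis, angle_deg, rng):
    """Unit-length stimulus at a given angle from `basis`, in a random direction."""
    norm = math.sqrt(sum(b * b for b in basis))
    unit = [b / norm for b in basis]
    # Random direction, made orthogonal to the basis (Gram-Schmidt step).
    g = [rng.gauss(0.0, 1.0) for _ in basis]
    proj = sum(a * b for a, b in zip(g, unit))
    ortho = [a - proj * b for a, b in zip(g, unit)]
    ortho_norm = math.sqrt(sum(a * a for a in ortho))
    ortho = [a / ortho_norm for a in ortho]
    theta = math.radians(angle_deg)
    return [math.cos(theta) * u + math.sin(theta) * o for u, o in zip(unit, ortho)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
basis = [rng.gauss(0.0, 1.0) for _ in range(64)]  # hypothetical 8 x 8 basis
stim = stimulus_at_angle(basis, 50.0, rng)

# Response of a purely linear neuron with unit-norm weights: cos(50 degrees).
basis_norm = math.sqrt(sum(b * b for b in basis))
linear_response = sum(s * b for s, b in zip(stim, basis)) / basis_norm
print(linear_response, math.cos(math.radians(50.0)))
```

One plausible reading of the normalization in panel (c) is that each measured response is divided by this linear prediction, so that a linear neuron would score 1.0 in every condition; that reading is an assumption rather than something stated in the caption.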
Figure 9
 
To allow a comparison between the nonlinearities in a sparse coding network and the nonlinearities in V1 neurons, we have plotted the results of the network in terms of the ratio of bandwidths calculated from the response to gratings and the Fourier transform of the response to spots. As the network becomes more overcomplete, the response to gratings becomes narrower while the response to spots becomes broader. For example, with a 3.9-times overcomplete network, the bandwidths derived from spots are 3.6 times broader than the bandwidths derived from gratings. On the far right, we show the ratio of bandwidths estimated from figure 1A of Tadmor and Tolhurst (1989). See text for details.
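A bandwidth ratio of this kind can be sketched for two sampled tuning curves. The bandwidth measure used in the article is not stated in this caption, so the sketch assumes full width at half maximum on a linear frequency axis, with linear interpolation between samples. The function name `half_height_width` and the two triangular curves are hypothetical.

```python
def half_height_width(xs, ys):
    """Full width at half maximum of a single-peaked sampled curve."""
    peak = max(ys)
    half = peak / 2.0
    i_peak = ys.index(peak)

    def crossing(step):
        # Walk away from the peak while the curve stays at or above half height.
        i = i_peak
        while 0 <= i + step < len(ys) and ys[i + step] >= half:
            i += step
        j = i + step
        if not 0 <= j < len(ys):
            return xs[i]  # curve never drops below half height on this side
        # Linear interpolation between the last sample >= half and the first below.
        frac = (ys[i] - half) / (ys[i] - ys[j])
        return xs[i] + frac * (xs[j] - xs[i])

    return crossing(+1) - crossing(-1)

# Hypothetical tuning curves on a shared frequency axis (arbitrary units).
freqs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
from_spots = [0, 1, 2, 3, 4, 3, 2, 1, 0]     # broader tuning
from_gratings = [0, 0, 0, 2, 4, 2, 0, 0, 0]  # narrower tuning

ratio = half_height_width(freqs, from_spots) / half_height_width(freqs, from_gratings)
print(ratio)
```

Spatial-frequency bandwidths are often reported in octaves instead; in that case the same crossings would be converted to a log2 ratio of the upper and lower half-height frequencies before taking the ratio of the two bandwidths.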
Figure 10
 
The hyperselectivity created by the curvature can produce neurons that are more localized than predicted from the Gabor limit. We show the results for a 2.6-times and a 3.9-times overcomplete sparse coding network (e.g., Figure 7). For each neuron in the network we measured the width in space (ΔX and ΔY) and the width in frequency (ΔU and ΔV). For a Gabor function, the product of these widths \(\Delta X*\Delta Y*\Delta U*\Delta V\) will be \(1/(4\pi^2)\). We call this product the localization factor and plot the results for each neuron in the network in (c–d). For each neuron, we plot the localization factor for the feed-forward basis (cyan) and the localization factor following nonlinear interactions that produce hyperselectivity (orange). The nonlinear interactions allow the localization factor to fall below the Gabor limit. The dotted lines show the mean localization factor for the linear and the nonlinear conditions. The triangle and square in (c) represent the neuron as depicted in (a–b).
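The value of the localization factor for a Gabor function can be checked with a short worked derivation. The sketch assumes that each width is the standard-deviation parameter of the corresponding Gaussian envelope (of the amplitude, not the energy) and that frequency is measured in cycles per unit length; the article's exact width measure is not given in this caption. The carrier frequency \(u_0\) is a symbol introduced here for illustration.

```latex
% Separable Gabor function with Gaussian envelope widths \Delta X and \Delta Y:
g(x, y) = \exp\!\left(-\frac{x^2}{2\,\Delta X^2} - \frac{y^2}{2\,\Delta Y^2}\right)\cos(2\pi u_0 x)

% The Fourier transform of \exp(-x^2 / 2\sigma^2) is proportional to
% \exp(-2\pi^2\sigma^2 u^2), a Gaussian in u with width 1/(2\pi\sigma). Hence
\Delta U = \frac{1}{2\pi\,\Delta X}, \qquad \Delta V = \frac{1}{2\pi\,\Delta Y}

% and the localization factor is
\Delta X \,\Delta Y \,\Delta U \,\Delta V
  = \Delta X \,\Delta Y \cdot \frac{1}{2\pi\,\Delta X} \cdot \frac{1}{2\pi\,\Delta Y}
  = \frac{1}{4\pi^2}
```

Under these assumptions the product does not depend on the envelope widths themselves, which is why a single reference value can serve as the Gabor limit for every neuron in the plots.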
Supplement 1