**Receptive field identification is a vital problem in sensory neurophysiology and vision. Much research has been devoted to identifying the receptive fields of nonlinear neurons whose firing rate is determined by the nonlinear interactions of a small number of linear filters. Although more advanced methods have been proposed, spike-triggered covariance (STC) remains the most widely used method in such situations due to its simplicity and intuitiveness. Although the connection between STC and Wiener/Volterra kernels has often been mentioned in the literature, this relationship has never been explicitly derived. Here we derive this relationship and show that the STC matrix is actually a modified version of the second-order Wiener kernel, one that incorporates the input autocorrelation and mixes first- and second-order dynamics. We then show how, with little modification of the STC method, the Wiener kernels may be obtained and, from them, the principal dynamic modes (PDMs), a set of compact and efficient linear filters that essentially combine the spike-triggered average (STA) and STC matrix and generalize to systems with both continuous and point-process outputs. Finally, using Wiener theory, we show how these obtained filters may be corrected when they were estimated from correlated inputs. Our correction technique is shown to be superior to those commonly used in the literature for both correlated Gaussian images and natural images.**

The STA, however, cannot characterize neurons whose firing rate is determined by the nonlinear interactions of *multiple* linear filters. For example, in the popular Adelson-Bergen energy model, the output is determined by the sum of the squares of the outputs of two Gabor filters (Adelson & Bergen, 1985; see Figure 3B); thus, the STA of such a system is zero because it responds equally to positive and negative contrasts. In other, less severe cases, the system may be approximated by the STA, but multiple filters would still be required to characterize it completely. The most popular method for obtaining these multiple filters is the spike-triggered covariance (STC) method, which builds on the STA framework by taking the covariance rather than the mean of all spike-triggering stimuli. Although other, more complex methods have been used to obtain these filters (Paninski, 2003; Pillow & Park, 2011; Rapela, Mendel, & Grzywacz, 2006; Saleem, Krapp, & Schultz, 2008; Sharpee, Rust, & Bialek, 2004), none has been as popular as the relatively straightforward and simple STC method, which has been used in areas as diverse as modeling Hodgkin-Huxley dynamics (Agüera y Arcas, Fairhall, & Bialek, 2001), vision (Fairhall et al., 2006; Rust, Schwartz, Movshon, & Simoncelli, 2005; Touryan, Felsen, & Dan, 2005), audition (Slee, Higgs, Fairhall, & Spain, 2005), insect movement (Fox, Fairhall, & Daniel, 2010; de Ruyter van Steveninck & Bialek, 1988), and olfaction (Kim, Lazar, & Slutskiy, 2011). An excellent review of both STA and STC for characterizing receptive fields appears in Schwartz, Pillow, Rust, and Simoncelli (2006).

The output of a nonlinear dynamic system, *y*[*t*], can be expressed in terms of the input, *x*[*t*], and a series of increasing-order nonlinear filters, known as the Volterra kernels, *k*_{q}[*τ*_{1}, …, *τ*_{q}], as

$$y[t] = k_0 + \sum_{\tau_1} k_1[\tau_1]\,x[t-\tau_1] + \sum_{\tau_1}\sum_{\tau_2} k_2[\tau_1,\tau_2]\,x[t-\tau_1]\,x[t-\tau_2] + \cdots$$

Although, in this form, the Volterra series describes a dynamic system in which the input is temporal in nature (i.e., *x*[*t*]), the Volterra framework can just as easily be applied to systems in which the input is spatial or spatiotemporal. Thus, the spatial version of Equation 1 for the first two terms is

$$y[t] = k_0 + \sum_{n} k_1[n]\,x[n,t] + \sum_{n_1}\sum_{n_2} k_2[n_1,n_2]\,x[n_1,t]\,x[n_2,t],$$

where, in the context of receptive field identification, *x*[*n*, *t*] is the *n*th pixel of the input image at time *t*. For the remainder of this section, we will assume a spatial system even though any intuitions may just as easily be applied to temporal systems.
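To make the expansion concrete, the first two terms of the spatial Volterra series can be evaluated directly. The sketch below (with small, purely illustrative kernels; NumPy assumed) computes the zeroth-, first-, and second-order contributions to *y*[*t*]:

```python
import numpy as np

def volterra2_output(x, k0, k1, k2):
    """Evaluate the first two terms of a spatial Volterra series.

    x  : (N, T) array, x[n, t] is the n-th pixel at time t
    k0 : scalar zeroth-order kernel
    k1 : (N,) first-order kernel, weighs single-pixel contributions
    k2 : (N, N) symmetric second-order kernel, weighs pairwise pixel products
    """
    linear = k1 @ x                                 # sum_n k1[n] x[n,t]
    quadratic = np.einsum('it,ij,jt->t', x, k2, x)  # sum_{n1,n2} k2[n1,n2] x[n1,t] x[n2,t]
    return k0 + linear + quadratic

# Tiny hypothetical example: 2 pixels, 3 time bins
x = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
k0 = 0.5
k1 = np.array([1.0, -1.0])
k2 = np.array([[0.0, 0.5],
               [0.5, 0.0]])  # pure cross-facilitation between the two pixels
y = volterra2_output(x, k0, k1, k2)
print(y)  # -> [1.5, -0.5, 3.5]
```

The `einsum` contraction implements the double sum over pixel pairs; for a symmetric *k*_{2}, each off-diagonal pair contributes twice, once for each ordering of the indices.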

The first-order Volterra kernel, *k*_{1}[*n*], can be seen as weighing the contribution of a single pixel of the input to the output. Thus, the first-order Volterra functional describes a linear system, and the first-order Volterra kernel is equivalent to the well-known impulse response function in engineering and the STA in neuroscience (see Equation 6). The second-order Volterra kernel, the first nonlinear kernel, describes second-order nonlinear interactions between two input pixels. Each value of the kernel (depicted as a surface), for instance, *k*_{2}[*n*_{1}, *n*_{2}] at lags *n*_{1} and *n*_{2}, represents the weight (i.e., relative importance) of the product of pixel values *x*[*n*_{1}, *t*] *x*[*n*_{2}, *t*] for spatial systems, or of the time-lagged products *x*[*t* − *n*_{1}] *x*[*t* − *n*_{2}] for temporal systems, in reconstructing the system output, *y*[*t*]. Thus, a large positive value of *k*_{2}[*n*_{1}, *n*_{2}] implies strong mutual facilitation of the input-lagged values *x*[*t* − *n*_{1}] and *x*[*t* − *n*_{2}] in the way they affect *y*[*t*] (V. Z. Marmarelis, 2004). In vision research, second-order dynamics are commonly used to describe center-surround inhibition and phase insensitivity, and in temporal neural systems, they are commonly used to describe paired-pulse facilitation and depression (Krausz, 1975; Song, Marmarelis, & Berger, 2009). Likewise, the *q*th-order kernel describes the nonlinear interactions between *q* input pixels.

Using the cross-correlation technique of Lee and Schetzen (1965), the *q*th-order Wiener kernel can be estimated as simply a scaled version of the *q*th-order input-output cross-correlation:

$$h_q[n_1, \ldots, n_q] = \frac{1}{q!\,\sigma^{2q}}\,E\{\,y_0[t]\,x[n_1,t]\cdots x[n_q,t]\,\},$$

where *y*_{0}[*t*] is the demeaned output, *h*_{q} is the *q*th-order estimated Wiener kernel, and *σ*^{2} is the power (variance) of the GWN input. In particular, the first two Wiener kernels are

$$h_1[n] = \frac{1}{\sigma^{2}}\,E\{\,y_0[t]\,x[n,t]\,\}, \qquad h_2[n_1,n_2] = \frac{1}{2\sigma^{4}}\,E\{\,y_0[t]\,x[n_1,t]\,x[n_2,t]\,\}.$$

The STA has long been known to be proportional to the first-order input-output cross-correlation, *φ*_{xy}[*n*], and thus, under a wide class of inputs, to the first-order Wiener kernel, *h*_{1} (Equation 4; Rieke, Warland, de Ruyter van Steveninck, & Bialek, 1999). The critical step is to note that the sum of the spike-triggered stimuli is equal to the sum of all stimuli weighted by the spike train: whenever no spike occurs, *y*[*t*] will be zero, and thus, that stimulus will not influence the total sum. The derivation is as follows:

$$\mathrm{STA}[n] = \frac{1}{N}\sum_{t} y[t]\,x[n,t] = \frac{T}{N}\,\phi_{xy}[n],$$

where *N* is the total number of spikes in *y*[*t*].
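As a numerical illustration of this cross-correlation estimate, the snippet below (an illustrative simulation, not the paper's code; the receptive field shape and sample sizes are arbitrary choices) recovers the first-order kernel of a purely linear system from GWN input by scaling the input-output cross-correlation by 1/σ²:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, sigma = 8, 200_000, 1.0

# Hypothetical ground-truth linear receptive field
k1 = np.sin(np.linspace(0, np.pi, N))

x = rng.normal(0.0, sigma, size=(N, T))  # GWN input, one column per time bin
y = k1 @ x                               # purely linear system
y0 = y - y.mean()                        # demeaned output

# First-order Wiener kernel: scaled input-output cross-correlation
phi_xy = (x * y0).mean(axis=1)           # phi_xy[n] = E{ y0[t] x[n,t] }
h1 = phi_xy / sigma**2

print(np.max(np.abs(h1 - k1)))           # small estimation error
```

For this linear system the estimate is unbiased, and the residual error shrinks as 1/√T.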

For the second-order Wiener kernel, *h*_{2}, the relationship, to our knowledge, has never been explicitly defined (although see Bialek & de Ruyter van Steveninck, 2005; Park, Archer, Priebe, & Pillow, 2013). Here, we derive this relationship and show that for GWN inputs, the STC matrix is a modified version of *h*_{2} that intertwines first- and second-order dynamics and incorporates the input autocorrelation. We begin the derivation by noting that the covariance of a vector with mean *μ* can be written in terms of its second moment and *μ*. The output *y*[*t*] is then separated into its demeaned component and mean, *y*_{0}[*t*] + *m*_{y}. Collecting terms shows that the STC matrix equals the second-order Wiener kernel up to two modifications, where Φ_{x} is the input autocorrelation matrix. The first modification, *M*_{Φ}, incorporates the input autocorrelation, Φ_{x}. In the case of uncorrelated inputs, such as GWN, this modification adds a delta function down the diagonal of the matrix (in the case of correlated inputs, the *h*_{2} obtained via Equation 4 will also be biased; see the section on correcting for correlated inputs). This modification as a source of bias has been acknowledged previously in the context of information theory and addressed by removing the "prior" covariance matrix or the input autocorrelation matrix, Φ_{x} (Agüera y Arcas, Fairhall, & Bialek, 2003). The second modification, *M*_{1}, incorporates first-order dynamics through a term involving the STA. If the input *x*[*t*] is not demeaned, then there will be an additional modification to the Wiener kernels, which causes a significant bias. Note that this bias actually changes the shape of the STC matrix and the obtained linear filters, highlighting the absolute necessity of demeaning all stimuli when using STC or Wiener approaches. This is distinct from STA, with which not demeaning the input will only change the baseline of the obtained STA filter but not its actual shape.
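The standard STC pipeline to which this derivation refers can be sketched as follows. The example below is an illustrative simulation (the filters, threshold, and sample sizes are assumptions, not the paper's): it simulates an energy-model-like spiking neuron, computes the STA-subtracted spike-triggered covariance, subtracts the prior (GWN) covariance, and checks that the leading eigenvectors recover the ground-truth filter subspace:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 10, 100_000

# Two orthonormal hypothetical ground-truth filters (quadrature pair)
t_ax = np.arange(N)
w1 = np.cos(2 * np.pi * t_ax / N)
w2 = np.sin(2 * np.pi * t_ax / N)
w1 /= np.linalg.norm(w1); w2 /= np.linalg.norm(w2)

x = rng.normal(size=(T, N))                # GWN stimuli (zero mean by construction)
drive = (x @ w1) ** 2 + (x @ w2) ** 2      # energy-model prethreshold output
spikes = drive > np.quantile(drive, 0.85)  # threshold trigger (~15% spike rate)

x_sp = x[spikes]
sta = x_sp.mean(axis=0)
stc = np.cov((x_sp - sta).T)               # STA-subtracted spike-triggered covariance
delta_c = stc - np.eye(N)                  # subtract prior covariance (GWN, sigma = 1)

evals, evecs = np.linalg.eigh(delta_c)
order = np.argsort(np.abs(evals))[::-1]
e1, e2 = evecs[:, order[0]], evecs[:, order[1]]

# Fraction of each significant eigenvector lying in span{w1, w2}
W = np.stack([w1, w2], axis=1)
proj = lambda v: np.linalg.norm(W @ (W.T @ v))
print(proj(e1), proj(e2))                  # both close to 1
```

The eigenvectors are recovered up to an arbitrary rotation within the significant subspace, which is why quality is measured by subspace overlap rather than filter-by-filter agreement.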

Subtracting the STA from the spike-triggered stimuli removes the *M*_{1} modification, leaving only the *M*_{Φ} modification, which can easily be removed by subtracting the input autocorrelation. Thus, the unmodified Wiener kernel, *h*_{2}, can be obtained directly from the STC matrix after appropriate rescaling (see Equation 9). This finding, however, does not mean that the linear filters obtained from the STC matrix are biased. To begin the search for an efficient linear filter bank, we note that any arbitrary set of Wiener/Volterra functions can be represented as a Wiener-Bose cascade model composed of a set of linear filter banks followed by a multivariate static nonlinearity (see Figure 1; V. Z. Marmarelis, 2004; Westwick & Kearney, 2003). However, unlike the true Wiener kernels, which have a canonical form from which any deviation will lead to suboptimal predictive power, any Wiener system can be expressed by an infinite number of Wiener-Bose cascades so long as the linear filter banks span the same subspace. Thus, a desirable linear filter bank will be (a) highly predictive, essentially spanning the same subspace as the equivalent Wiener-Bose cascade; (b) compact, containing as few filters as necessary; and (c) interpretable, facilitating scientific discovery. Our group has attempted to find such a desirable filter bank over the last two decades in the context of dynamic system identification through the use of PDMs (V. Z. Marmarelis, 2004; V. Z. Marmarelis & Orme, 1993; V. Z. Marmarelis et al., 2014). The PDMs are a system-specific and input-independent set of linear filters that aim to compactly describe the dynamics of nonlinear systems and, thus, are functionally equivalent to the STC filters.

**Figure 1**


where *h*_{0} = *m*_{y} is the 0th-order Wiener kernel. Now, as in standard STC analysis, this matrix is eigendecomposed, where *E* is the orthonormal matrix of its eigenvectors. The *L* significant eigenvectors are selected according to some criteria, and those eigenvectors, from the second element onward, can be considered the PDMs, while *c*_{i}, the first element of the *i*th eigenvector, is retained as a constant. Note that the effect of the *c*_{i} constant is to offset the parabolic static nonlinearity (see the section on associated nonlinear functions). The representation becomes increasingly accurate as *L* is increased. Having established these theoretical underpinnings, other approaches that claim to estimate unbiased linear filters may be considered. Most notably, Samengo and Gollisch (2013) have demonstrated that if one defines the STC filters as the STA and the significant eigenvectors of the STC matrix, these STC filters will span the same linear subspace whether or not the STA is subtracted prior to calculating the STC matrix. Their proof demonstrates that even though modification *M*_{1} makes the STC matrix a biased Wiener kernel, the set of linear filters obtained from the STC matrix, provided the STA is included, is itself unbiased.

Alternatively, a compact filter bank may be obtained jointly from *h*_{1} and *h*_{2} (or STC matrix). This is done by taking the significant singular vectors of the rectangular matrix defined as [*h*_{1}, *h*_{2}]. Significance was determined by bootstrapping both the *h*_{2} eigenvalues and the [*h*_{1}, *h*_{2}] singular values, with *T* = 40 bootstrap repetitions for each hypothesis test.
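A minimal sketch of this SVD-based extraction, using synthetic kernels built from two hypothetical modes (illustrative values only, not the paper's systems):

```python
import numpy as np

N = 12
t_ax = np.arange(N)

# Hypothetical ground-truth modes (damped oscillations)
w1 = np.exp(-t_ax / 4.0) * np.cos(t_ax); w1 /= np.linalg.norm(w1)
w2 = np.exp(-t_ax / 4.0) * np.sin(t_ax); w2 /= np.linalg.norm(w2)

# Synthetic first- and second-order Wiener kernels generated by these modes
h1 = 0.8 * w1
h2 = 1.5 * np.outer(w1, w1) - 1.0 * np.outer(w2, w2)

# Rectangular matrix combining h1 and h2; its significant left singular
# vectors serve as the compact filter bank
M = np.hstack([h1[:, None], h2])       # shape (N, N + 1)
U, s, _ = np.linalg.svd(M)
n_sig = int(np.sum(s > 1e-10 * s[0]))  # numerically significant singular values
pdms = U[:, :n_sig]

# Each recovered filter should lie in span{w1, w2}
Qb, _ = np.linalg.qr(np.stack([w1, w2], axis=1))
overlap = np.linalg.norm(Qb @ (Qb.T @ pdms), axis=0)
print(n_sig, overlap)                  # 2 filters, overlaps close to 1
```

In practice the noise floor of the singular values is not exactly zero, which is why significance is assessed by bootstrap rather than a fixed numerical threshold.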

*Correcting for correlated inputs*: Touryan et al. (2005) proposed that the STC method could be corrected for correlated inputs by first "prewhitening" these inputs and then using the standard STC method. Later, Samengo and Gollisch (2013) showed that this was equivalent to multiplying the biased STC matrix with the input autocorrelation matrix. Inverting Φ_{x}, however, amplifies noise in the input. Thus, care must be taken with the corrected kernel, *h*_{2}, which is obtained by twice multiplying the biased kernel with the inverse of Φ_{x}.
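This "twice multiplying by the inverse autocorrelation" correction can be illustrated numerically. The sketch below is an assumption-laden toy (the kernel, correlation structure, and sample size are arbitrary, and Φ_{x} is taken as known rather than estimated): for a zero-mean Gaussian input with covariance Φ and a purely quadratic system *y* = (*w*′*x*)², Wick's theorem gives ½ E{*y*_{0} *x* *x*′} = ΦQΦ with Q = *ww*′, so multiplying by Φ⁻¹ on both sides recovers the kernel:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 6, 600_000

# Correlated Gaussian input with known autocorrelation matrix Phi (AR(1)-like)
Phi = 0.5 ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
x = np.linalg.cholesky(Phi) @ rng.normal(size=(N, T))

# Purely second-order system, y = (w'x)^2, i.e., kernel Q = w w'
w = np.hanning(N); w /= np.linalg.norm(w)
Q = np.outer(w, w)
y0 = (w @ x) ** 2
y0 = y0 - y0.mean()                      # demeaned output

# Biased second-order estimate: 0.5 E{ y0[t] x x' } -> Phi Q Phi for Gaussian x
C = 0.5 * ((x * y0) @ x.T) / T

# Correction: twice multiply by the (ridge-regularized) inverse autocorrelation
Phi_inv = np.linalg.inv(Phi + 1e-6 * np.eye(N))
Q_hat = Phi_inv @ C @ Phi_inv

cosine = lambda A, B: np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B))
cos_biased, cos_corr = cosine(C, Q), cosine(Q_hat, Q)
print(cos_biased, cos_corr)              # correction improves kernel similarity
```

The small ridge term added before inversion is one simple guard against the noise amplification noted above; with an estimated (rather than known) Φ_{x}, heavier regularization is typically needed.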

In the Wiener-Bose model, the multivariate static nonlinearity takes the form *F*[*v*_{1}(*t*) … *v*_{L}(*t*)], where {*v*_{1}(*t*) … *v*_{L}(*t*)} are the filter outputs (V. Z. Marmarelis, 2004). In the STC literature, the nonlinearity following these filters has traditionally been estimated using a histogram approach based on Bayes' rule:

$$P(\mathrm{spike}\mid v_1,\ldots,v_L) = \frac{P(v_1,\ldots,v_L\mid \mathrm{spike})\,P(\mathrm{spike})}{P(v_1,\ldots,v_L)},$$

where the above quotient is estimated by dividing each filter output into *B* bins and calculating the proportion of times a spike was elicited when the filter output fell into a specific set of bins (V. Z. Marmarelis et al., 2013; Schwartz et al., 2006). An advantage of this approach is that because the obtained nonlinearity is not constrained to be a polynomial, it can describe many nonlinearities that would otherwise be difficult to characterize with a Volterra approach, such as a sigmoidal nonlinearity (Bartesaghi, Gessi, & Migliore, 1995). However, a major issue with this approach is that the dimension of the nonlinearity is equal to the number of filters. For example, even with three filters, estimating this nonlinearity becomes unfeasible due to the large number of parameters (*B*^{3}), and even if there were enough data to estimate this nonlinearity, it would be very difficult to visualize and interpret. Some researchers have attempted to solve this issue by using a priori assumptions about the underlying physiological system to simplify the high-dimensional multivariate nonlinearity into simpler forms (Bialek & de Ruyter van Steveninck, 2005; Rust et al., 2005). Alternatively, many researchers have attempted to get around this "curse of dimensionality" by assuming that the filter outputs are independent of each other, or, equivalently, that there are no nonlinear interactions between the *L* filters. In such a situation, the output of each filter would be transformed independently by a one-dimensional static nonlinearity, which our group has previously dubbed the associated nonlinear function (ANF) (V. Z. Marmarelis, 2004). Thus, with this "independence assumption," one would only need to estimate *L* one-dimensional nonlinearities (ANFs), each with *B* parameters, rather than one *L*-dimensional nonlinearity with *B*^{L} parameters.

When the system dynamics are fully captured by the first- and second-order kernels, the PDM outputs contribute independently to *y*[*t*], and the "independence assumption" is satisfied. If, however, the underlying system has higher-order nonlinearities and the PDMs are acquired from only the first- and second-order kernels (as currently done), then these PDMs may or may not have nonlinear interactions or "cross-terms," which also contribute to the output, thus violating the "independence assumption." Our group has developed methods for statistically testing whether such cross-terms exist, and, if they do, how one may characterize the system without resorting to a full multivariate nonlinearity (V. Z. Marmarelis, 2004; V. Z. Marmarelis et al., 2013). In this work, however, all of the prethreshold dynamics of the simulated systems are second order, and thus these methods will not be used. It should be noted that even though the threshold trigger in spiking systems technically makes the nonlinearity "infinite order," it has been shown that the number of Wiener kernels needed to completely characterize such a system is at most the order of nonlinearity of the prethreshold dynamics (V. Z. Marmarelis, Citron, & Vivo, 1986).
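Under the independence assumption, each ANF reduces to a one-dimensional version of the Bayes-rule histogram estimate. A sketch with a hypothetical sigmoidal ground truth (the bin count, sample size, and nonlinearity are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
T, B = 200_000, 20

# Hypothetical filter output and a sigmoidal ground-truth spike probability
v = rng.normal(size=T)
p_true = 1.0 / (1.0 + np.exp(-3.0 * (v - 0.5)))
spikes = rng.random(T) < p_true

# Histogram / Bayes-rule estimate of P(spike | v in bin b):
# fraction of time bins falling in bin b that contained a spike
edges = np.linspace(-3, 3, B + 1)
idx = np.clip(np.digitize(v, edges) - 1, 0, B - 1)
counts = np.bincount(idx, minlength=B).astype(float)
spike_counts = np.bincount(idx, weights=spikes.astype(float), minlength=B)
anf = np.divide(spike_counts, counts, out=np.zeros(B), where=counts > 0)

centers = 0.5 * (edges[:-1] + edges[1:])
print(np.round(anf, 2))  # estimated ANF, one value per bin
```

Bins in the tails of the filter-output distribution collect few samples, so their estimates are noisy; this is the one-dimensional version of the data-hunger problem that makes the full *B*^{L}-parameter nonlinearity unfeasible.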

Synthetic systems were constructed from *L* = 3 Laguerre basis functions (and, for the second-order case, the outer products of these basis functions) (Ogura, 1985). Unless otherwise noted, the first- and second-order kernels were normalized so that their respective output power would be equal. The output was generated by feeding a random Gaussian input through this system and then thresholding the output. The first two thirds of the data were used for model estimation (training set), and the remaining third was used for model assessment (testing set).
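For reference, the discrete Laguerre functions of Ogura (1985) can be generated from one common closed form; the decay parameter α and truncation length below are arbitrary illustrative choices, not the paper's settings:

```python
import numpy as np
from math import comb

def laguerre_basis(L, alpha, M):
    """Discrete-time Laguerre functions b_j(m), j = 0..L-1, m = 0..M-1.

    alpha in (0, 1) controls the decay rate; the functions are orthonormal
    on m = 0..infinity, so M should be large enough for the tails to vanish.
    """
    B = np.zeros((L, M))
    for j in range(L):
        for m in range(M):
            s = sum((-1) ** k * comb(m, k) * comb(j, k)
                    * alpha ** (j - k) * (1 - alpha) ** k
                    for k in range(j + 1))
            B[j, m] = alpha ** ((m - j) / 2) * np.sqrt(1 - alpha) * s
    return B

# L = 3 basis functions, as used for the synthetic systems here
B = laguerre_basis(3, alpha=0.3, M=60)
G = B @ B.T                # Gram matrix; ~identity if orthonormal
print(np.round(G, 3))
```

Random second-order kernels can then be formed as weighted sums of the outer products `np.outer(B[i], B[j])`, mirroring the construction described above.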

Two metrics were used to assess how well the model predicts the output, *y*[*t*]. Receiver operating characteristic (ROC) curves, which evaluate the performance of a binary classifier (Caruana & Niculescu-Mizil, 2004; Zanos et al., 2008), plot the true positive rate against the false positive rate for every value of the detection threshold, *T*. The area under the curve (AUC) of ROC plots is used as a performance metric of the model and has been shown to be equivalent to the Mann-Whitney two-sample statistic (Hanley & McNeil, 1982). The AUC, which is obtained by numerically integrating the ROC curve, ranges from 0 to 1, with 0.5 indicating a random predictor and higher values indicating better model performance. The second metric used was the Pearson correlation coefficient, *ρ*, between the estimated and true prethreshold outputs. Unless otherwise mentioned, all model assessment metrics were evaluated on the testing set.
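The ROC/AUC computation and its equivalence to the Mann-Whitney statistic can be sketched as follows (synthetic scores and labels, not the paper's data):

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC by sweeping a detection threshold T over the scores and
    integrating the empirical ROC curve with the trapezoidal rule."""
    thresholds = np.concatenate(([np.inf], np.sort(scores)[::-1], [-np.inf]))
    pos, neg = labels.sum(), (~labels).sum()
    tpr = np.array([((scores >= T) & labels).sum() / pos for T in thresholds])
    fpr = np.array([((scores >= T) & ~labels).sum() / neg for T in thresholds])
    return np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)

def mann_whitney_auc(scores, labels):
    """Equivalent AUC via the Mann-Whitney two-sample statistic: the probability
    that a randomly chosen spike bin outscores a randomly chosen non-spike bin."""
    pos, neg = scores[labels], scores[~labels]
    return ((pos[:, None] > neg[None, :]).mean()
            + 0.5 * (pos[:, None] == neg[None, :]).mean())

rng = np.random.default_rng(4)
labels = rng.random(500) < 0.3                 # "spike" indicator per time bin
scores = rng.normal(size=500) + 1.5 * labels   # model output, higher when spiking
a1, a2 = roc_auc(scores, labels), mann_whitney_auc(scores, labels)
print(a1, a2)                                  # identical up to floating error
```

With every observed score used as a threshold and no ties, the two computations agree exactly, which is the Hanley and McNeil (1982) equivalence cited above.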

Figure 2 compares the second-order Wiener kernel, *h*_{2}, and the STC matrix. Relative to *h*_{2} (shown in Figure 1), the STC matrix exhibits the addition of the input autocorrelation, *M*_{Φ}, which can be seen as the "delta function" down the matrix diagonal, and the intermixing of first- and second-order dynamics (*M*_{1}). These modifications prohibit any straightforward interpretation or prediction from the STC matrix and relegate it to being an intermediate step for obtaining the STC filters, which are predictive and interpretable.

**Figure 2**


**Figure 3**


In Figure 2b, the *M*_{1} modification was present. As can be seen, when the input is uncorrelated ("white"), this modification asymptotically has no effect on the obtained STC filters; when dealing with finite data records, however, it may induce negative artifacts. In Figure 2c, the *M*_{1} modification was removed, resulting in a different set of STC matrix eigenvectors. However, as shown by Samengo and Gollisch (2013), when the STA is included, these eigenvectors should span the same linear subspace as the linear filters obtained from the traditional STC matrix. Finally, in Figure 2d, both modifications were removed to obtain the estimated canonical Wiener kernel (compare with the true *h*_{2} in Figure 1a). This is the only kernel whose elements can be interpreted as the weight assigned by pairwise interactions of the inputs. The PDMs, obtained by the SVD method, are shown in Figure 2dii. The method recovered a compact and efficient set of three linearly independent filters. Notice that although the PDMs and STC filters should theoretically span the same subspace, the obtained PDMs much more closely match the linear filters used to generate the output (Figure 1b). The projection of spike-triggering stimuli (blue dots) and non-spike-triggering stimuli (red dots) onto the first two PDMs can be further seen in Figure 2g. As can be seen, the first two PDMs achieve a large degree of separability between the stimuli, although inclusion of the third PDM is needed to achieve near-perfect separability.

The system in Figure 3 contains *only* second-order (quadratic) dynamics. Thus, the STA, or first-order Wiener kernel, shown in Figure 3b, is zero. This means that the *M*_{1} modification present in the STC matrix is irrelevant, leaving only the input-autocorrelation modification, *M*_{Φ}. Note that here, unlike in the previous one-dimensional (temporal) example, visually examining the second-order kernel is meaningless because the 2-D (spatial) input was arbitrarily rearranged into a 1-D vector. However, interpretation of the second-order Wiener kernel may still be done by considering only specific parts of it, such as the diagonal (Citron & Marmarelis, 1987). Figure 3c and 3d show the eigenvalue distribution, linear filters, and nonlinearities of the STC and PDM methods. As can be seen, both methods recover the ground truth filters very accurately. Furthermore, as can be presumed from the similarity of the obtained filters, both methods had roughly equal predictive power. This shows that in the case of uncorrelated inputs, the *M*_{Φ} modification has an asymptotically negligible effect on the recovered filters. As we shall see later, however, this is not the case for correlated inputs, when the input autocorrelation is more complex than a delta function down the kernel diagonal.

Both filter sets achieved nearly equal predictive power (*AUC*_{PDM} = .985 vs. *AUC*_{STC} = .98). Furthermore, the PDMs were much closer to the original rotation of the ground truth filters, thus facilitating interpretability (see Kaardal et al., 2013, and Pillow & Simoncelli, 2006, which explore this vital issue more deeply).

**Figure 4**


Figure 5 summarizes performance over *T* = 30 randomly generated systems with GWN inputs (see Methods, section on simulations and model assessment). As noted previously, all synthetic systems could be represented as a second-order Volterra system or, alternatively, as a Wiener-Bose cascade consisting of three independent linear filters. In Figure 5a, the predictive power was measured as a function of the number of included filters. For the STC method, the first included filter was the STA. As can be seen, the PDMs modestly outperformed the STC filter set for any number of included filters. Notably, the first PDM outperformed the STA filter, showing that even in a Wiener-Bose cascade model with a single linear filter, the STA is suboptimal. Furthermore, as suggested by the example in Figure 2, the predictive power of the STC filter set decreased with the inclusion of the third significant STC filter (fourth filter overall) due to the linear dependence between the STA and STC filters. This occurred even though the fourth filter was deemed significant by the bootstrap method, demonstrating that cross-validation is essential for filter selection. Figure 5b compares the predictive power of the PDM and STC filter sets (with three filters). As can be seen, the PDMs outperformed the STC filters in almost every trial (*ρ*_{PDM} = .843 vs. *ρ*_{STC} = .815, *p* < 0.001).

**Figure 5**


*N* = 30 trials were conducted with randomly generated synthetic systems as described previously. For each synthetic system, the STC and PDM estimation methods were applied to randomly generated training/in-sample data of various lengths from 200 to 15,000 bins. Then the predictive performance of the estimated filters was evaluated on testing/out-of-sample data using the *ρ* metric. The convergence of the data is described by the shape of the testing set curve in Figure 6a, top, and, in particular, how quickly this curve approaches a steady-state value. In order to facilitate comparisons between the two methods, their predictive power was plotted as a percentage of their maximal value in Figure 6a, bottom. As can be seen, in noise-free conditions, it takes both methods only about 150 spikes to converge to more than 90% of their maximal value. The singular value distribution of the PDM method for a sample system is shown in Figure 6b. As shown by the red line, with only 200 output spikes, the significant singular values can be distinguished from the remaining ones.

**Figure 6**


When the input-autocorrelation modification, *M*_{Φ}, is not removed, it is a serious source of bias when using correlated inputs. These different filter sets will henceforth be referred to as filter set I, filter set II, and filter set III, respectively. Figure 7b through 7d shows the kernels, eigenvalue/singular value distribution, and obtained filters from all three methods. Note that the corrected Wiener kernel in Figure 7di obtained near-perfect predictive power (as seen in Figure 7e), thus validating the correction formula of Equation 16. Note that all methods showed only two significant eigenvalues even though there are three ground truth filters. The reason for this is presumably that the third filter is inhibitory, and thus, the application of a relatively high binary threshold may have obstructed its recovery (V. Z. Marmarelis et al., 1986). As mentioned previously, the quality of the obtained filter sets must be assessed by their predictive power, which quantifies how closely the obtained filter set spans the true filter space. As can be seen in the ROC plot in Figure 7f, filter set III has a significantly higher AUC value than the other two methods. The superiority of filter set III can also be seen by comparing it with the ground truth filters in Figure 7a: its two filters match the first two filters of the system almost exactly. The two filters of filter sets I and II, on the other hand, are strongly biased versions of the first two ground truth filters. It should also be noted that recent work by Aljadeff, Segev, Berry, and Sharpee (2013) has proposed alternative methods for recovering optimal linear subspaces when using highly correlated Gaussian inputs, which may further improve the results here and possibly recover the third linear filter.

**Figure 7**


**Figure 8**


In summary, it was shown that the STC matrix is a modified version of the second-order Wiener kernel that incorporates the input autocorrelation (*M*_{Φ}) and mixes first- and second-order dynamics (*M*_{1}). Furthermore, it was shown how one may obtain the Wiener kernels while preserving the simplicity and elegance of the STC method and with no additional computational expense, i.e., for free (see Equation 9).

*Journal of the Optical Society of America A*, 2 (2), 284–299.

*Biological Cybernetics*, 42 (2), 133–143.

*Advances in neural information processing systems 13: Proceedings of the 2000 conference* (p. 75). Cambridge, MA: MIT Press.

*Neural Computation*, 15 (8), 1715–1749.

*PLoS Computational Biology*, 9 (9), e1003206.

*Hippocampus*, 5 (5), 440–451.

*IEEE Transactions on Neural Systems and Rehabilitation Engineering*, 20 (2), 198–211.

*arXiv preprint q-bio/0505003*.

*PLoS Computational Biology*, 6 (10), e1000967.

*Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining* (pp. 69–78). ACM.

*Network: Computation in Neural Systems*, 12 (2), 199–213.

*Biological Cybernetics*, 57 (4–5), 241–247.

*crcns.org*, doi:10.6080/K0RN35SV.

*IEEE Transactions on Biomedical Engineering*, (3), 169–179.

*Proceedings of the Royal Society of London. Series B, Biological Sciences*, 234 (1277), 379–414.

*Hearing Research*, 66 (2), 177–201.

*Journal of Computational Neuroscience*, 34 (1), 163–183.

*Journal of Neurophysiology*, 58 (1), 33–65.

*Journal of Neurophysiology*, 96 (5), 2724–2738.

*PLoS Computational Biology*, 7 (10), e1002249.

*The Journal of Neuroscience*, 34 (16), 5515–5528.

*Proceedings of the National Academy of Sciences, USA*, 107 (8), 3840–3845.

*Radiology*, 143 (1), 29–36.

*Journal of Neuroscience Methods*, 203 (1), 220–232.

*Neural Computation*, 25 (7), 1870–1890.

*Journal of Computational Neuroscience*, 30 (1), 143–161.

*IEEE Transactions on Acoustics, Speech and Signal Processing*, 33 (6), 1445–1455.

*Annals of Biomedical Engineering*, 16 (2), 201–214.

*Biological Cybernetics*, 19 (4), 217–230.

*International Journal of Control*, 2 (3), 237–254.

*Journal of Neurophysiology*, 36 (4), 605–618.

*Annals of Biomedical Engineering*, 21 (6), 573–589.

*Advanced methods of physiological system modeling* (pp. 229–242). New York: Plenum Press.

*Annals of Biomedical Engineering*, 25 (2), 239–251.

*Nonlinear dynamic modeling of physiological systems*. Hoboken, NJ: Wiley-Interscience.

*Mathematical Biosciences*, 196 (1), 1–13.

*Annals of Biomedical Engineering*, 27 (1), 23–31.

*Biological Cybernetics*, 54 (2), 115–123.

*Annals of Biomedical Engineering*, 27 (3), 391–402.

*IEEE Transactions on Biomedical Engineering*, 40 (11), 1149–1158.

*Journal of Computational Neuroscience*, 34 (1), 73–87.

*Journal of Computational Neuroscience*, 36 (3), 321–337.

*The Open Biomedical Engineering Journal*, 6, 42.

*15th NIBB Conference, Okazaki, Japan* (pp. 14–62), 1985.

*Network: Computation in Neural Systems*, 14 (3), 437–464.

*Advances in neural information processing systems*(pp. 2454–2462). NIPS Proceedings.

*Neuron*, 60 (5), 890–903.

*Advances in neural information processing systems*, (pp. 1692–1700). NIPS Proceedings.

*Nature*, 454 (7207), 995–999.

*PLoS ONE*, 8 (11), e71959.

*Spikes: Exploring the Neural Code*. Cambridge, MA: MIT Press.

*Neuron*, 46 (6), 945–956.

*Journal of Computational Neuroscience*, 34 (1), 137–161.

*Neural Engineering (NER), 2013 6th International IEEE/EMBS Conference on* (pp. 1143–1146). IEEE.

*Journal of Computational Neuroscience*, 240, 1–15.

*Journal of Neuroscience Methods*, 240, 179–192.

*Nature Neuroscience*, 4 (8), 819–825.

*Neural Computation*, 16 (2), 223–250.

*The Journal of Neuroscience*, 25 (43), 9978–9988.

*IEEE Transactions on Biomedical Engineering*, 54 (6), 1053–1066.

*Journal of Computational Neuroscience*, 26 (1), 1–19.

*Journal of Computational Neuroscience*, 35 (3), 335–357.

*Network: Computation in Neural Systems*, 12 (3), 289–316.

*Neuron*, 45 (5), 781–791.

*Journal of Neurophysiology*, 93 (2), 1074–1089.

*Identification of nonlinear physiological systems* (pp. 57–101). Hoboken, NJ: John Wiley & Sons.

*Nonlinear problems in random theory*. Cambridge, MA: The MIT Press.

*IEEE Transactions on Neural Systems and Rehabilitation Engineering*, 16 (4), 336–352.

*IEEE Transactions on Biomedical Engineering*, 51 (2), 255–262.