**Characterization of the functional relationship between sensory inputs and neuronal or observers' perceptual responses is one of the fundamental goals of systems neuroscience and psychophysics. Conventional methods, such as reverse correlation and spike-triggered data analyses are limited in their ability to resolve complex and inherently nonlinear neuronal/perceptual processes because these methods require input stimuli to be Gaussian with a zero mean. Recent studies have shown that analyses based on a generalized linear model (GLM) do not require such specific input characteristics and have advantages over conventional methods. GLM, however, relies on iterative optimization algorithms and its calculation costs become very expensive when estimating the nonlinear parameters of a large-scale system using large volumes of data. In this paper, we introduce a new analytical method for identifying a nonlinear system without relying on iterative calculations and yet also not requiring any specific stimulus distribution. We demonstrate the results of numerical simulations, showing that our noniterative method is as accurate as GLM in estimating nonlinear parameters in many cases and outperforms conventional, spike-triggered data analyses. As an example of the application of our method to actual psychophysical data, we investigated how different spatiotemporal frequency channels interact in assessments of motion direction. The nonlinear interaction estimated by our method was consistent with findings from previous vision studies and supports the validity of our method for nonlinear system identification.**

*y*of the system of interest takes only binary states (either

*F*(

*) followed by a cumulative Gaussian, the sigmoid-type response function, as follows:*

**x***y*, given input

**x**is therefore described as follows:

*F(*can be estimated by solving simultaneous equations composed of moments, i.e., expectation of the products of inputs

**x**)**x**and or output

*y*(the detail derivations are described in Appendix 1). Namely, we obtain a relational expression between the vectors/matrix of expectations

**a**,

**b**, and

*M,*and the coefficients of the moment-generating function

**g**as:

**g**are described using the kernels of

*F*(

**x**) as follows:

*y*at either −1 or 1 without loss of generality and consider the case's ability to estimate the kernels of Volterra series up to second-order terms from data observed by repeated experiments.

*m*trials as

*X*consist of the input values of all stimulus dimensions and their binomial products from

*m*trials as

**a**and matrix

*M*are obtained by the following approximation:

*R*

^{2}= 0.964,

*p*< 0.01). The correlation between our method and ground truth was also significantly different from that between STA and ground truth, and between whitened STA and ground truth (

*p*< 0.01, Meng's Z-test). The estimated kernel values by our method deviated slightly from the ground truth due to the truncation error of the second-order approximation in the calculation although the noise variance of the model was matched. Figure 2e shows the mean squared errors (MSEs) of the estimations of STA, whitened STA, and our method after adjusting the scale of kernels to minimize each MSE. Our method shows significantly lower MSE over STA and whitened STA (

*t*test of 20 estimations using different noncentered stimulus distribution,

*p*< 0.01).

*R*

^{2}> 0.859,

*p*< 0.01). The estimated kernel values by our method deviated slightly from the ground truth due to the truncation error of the second-order approximation in the calculation.

*R*

^{2}= 0.829, when sample size was 16,000). The performance is comparable with that of GLM, although the calculation cost is much lower with our method. The results also show that our method outperforms STA for a wide range of kernel estimation. (Postprocess whitening did not affect the results of STA since we used Gaussian stimulus input in this simulation). Figure 5b is the plot of the correlation of the second-order kernels between the ground truth and estimation. Although GLM shows better estimates than our method in terms of accuracy, the difference is small; additionally, our method provides accurate kernel estimation (whose correlation is significantly high as

*R*

^{2}= 0.911, when sample size was 16,000).

*R*

^{2}= 0.924, which was statistically indistinguishable from the estimation without truncating using 10-fold larger data sets (

*p*< 0.01, Meng's Z-test). Figure 3f shows that our method with eigenvalue truncation provided significantly better estimation than GLM when sample size was 25,000 (

*t*test,

*p*< 0.001).

^{2}). Then, 500 ms after the button was pressed, the fixation point disappeared, leaving only the blank screen for 200 ms. Moving sinusoidal gratings were then displayed on the monitor for 200 ms. After a 200-ms blank period, a fixation point reappeared on the monitor prior to the button-press response that triggered the next trial. Observers were instructed to judge whether they perceived the gratings to be moving upward or downward, as a whole, and indicate their observations by pressing a 2AFC button.

*y*= 1, and “downward” reports as

*y*= −1. The contrast values of 144 gratings correspond to input vector

*of 144 dimensions, varying from 0 to 2.11 [cd/m*

**x**^{2}]. We denote the nonlinear model of observers' assessments of motion direction using frequency values instead of using the index number of moving gratings in order to make the first- and second-order kernels interpretable in the frequency space as follows:

*l*), first-order (

*H*, the second order contribution of input

**x**to the assessment of direction can be expressed as

*H*has an eigenvector

**v**of an eigenvalue

*, 2 (2), 284–299.*

*Journal of Optical Society of America A**, 25, 18–18.*

*Perception**, 3299, 79–85.*

*Proceedings of SPIE**, 203, 237–260.*

*Journal of Physiology**, 12, 199–213.*

*Network: Computation in Neural Systems**, 6 (4), 1040–1058.*

*SIAM Journal on Optimization**, 24 (31), 6991–7006.*

*The Journal of Neuroscience**, 69, 1091–1117.*

*Journal of Neurophysiology**, 69, 1118–1135.*

*Journal of Neurophysiology**, 29, 241–246.*

*Vision Research**, 17, 1057–1065.*

*Vision Research**.*

*Max-Planck-Institute Technical Report No. TR-114**, 17 (6), 858–865.*

*Nature Neuroscience**(pp. 119–133). Cambridge, MA: Bradford Books, MIT Press.*

*Computational models of visual processing**, 25, 1493–1500.*

*Vision Research**, 32, 47–59.*

*Vision Research**, 37, 1048–1049.*

*Zeitschrift für Naturforschung C**, 160, 106–154.*

*Journal of Physiology**, 58 (6), 1187–1211.*

*Journal of Neurophysiology**, 8( 16): 10, 1–19, doi:10.1167/8.16.10. [PubMed] [Article]*

*Journal of Vision**, 9( 7), e1003143, 1–18.*

*PLoS Computational Biology**, 135 (3), 370–384.*

*Journal of the Royal Statistical Society, Series A**, 4 (2): 2, 82–91, doi:10.1167/4.2.2. [PubMed] [Article]*

*Journal of Vision**, 17, 802–808.*

*Psychonomic Bulletin & Review**, 10 (01), 016014.*

*Journal of Neural Engineering**Presented at the ARVO annual meeting*, Ft. Lauderdale, FL, 1995.

*, 246 (4972), 1037–104.*

*Science**, 78, 2861–2869.*

*Nonlinear Dynamics**, 14, 437–464.*

*Network: Computation in Neural Systems**, 15, 243–262.*

*Network: Computation in Neural Systems**, 26, 2454–2462.*

*Advances in Neural Information Processing Systems**, 378 (6554), 281–284.*

*Nature**, 28, 147–166.*

*Cognitive Science**, 5, 517–548.*

*Network: Computation in Neural Systems**. New York: John Wiley & Sons.*

*The Volterra and Wiener theories of nonlinear systems**, 14, 269–276.*

*Advances in Neural Information Processing Systems**, 7 (12): 8, 1–14, doi:10.1167/7.12.8. [PubMed] [Article]*

*Journal of Vision**, 15, 277–284.*

*Advances in Neural Information Processing Systems**. Cambridge, MA: MIT Press.*

*The cognitive neuroscience*(3rd ed., pp. 327–338)*, 93 (2), 1074–1089.*

*Journal of Neurophysiology**, 287, 1273–1276.*

*Science**, 110 (461), 319–324.*

*Technical Report of IEICE**, 21, 1115–1122.*

*Vision Research**, 23, 873–882.*

*Vision Research**P(y,x),*or the joint probability distribution of input

**x**and output

*y,*is written as

*P(x)*, or the probability distribution of input x, is written as

*F(*(Eq. 1) and a moment-generating function of

**x**)*P(*(Eq. S3) into Eq. S4, we obtain,

**x**)*y*, are given by its partial derivatives as follows

**a**,

**b**, and

*M,*and the coefficients of the moment-generating function

**g**as shown in Eq. 5 in the main text:

*M*by the approximation using Eq. 7. If we set the input data matrix

*X*obtained by time

*t*as

*M*at time

*t*as

*t*+ 1 as

*t*and

*t*+ 1.

*t*is updated as

*t+1*. Thus, we can update the estimated kernel by substituting Eq. SS2 and Eq. SS3 into Eq. 8.

**g**that minimize the following equation:

**g**, regularized coefficients