Polynomials of degree 2 are closely related to but at the same time much more general than the functions corresponding to the neural networks used in standard studies in the field (Hyvärinen & Hoyer, 2000; Hyvärinen & Hoyer, 2001; Körding et al., 2004; also see Section 6.1). Those models usually rely either on linear networks, which lie in the space of polynomials of degree 1, or on networks with one layer of a fixed number of linear units (2 to 25) followed by a quadratic nonlinearity (Figure 14b), which form a small subset of the space of polynomials of degree 2. This can be seen from the following considerations. Each polynomial of degree 2 can be written as an inhomogeneous quadratic form

\[
g(\mathbf{x}) = \frac{1}{2}\,\mathbf{x}^{T} H \mathbf{x} + \mathbf{f}^{T}\mathbf{x} + c \,,
\]

where H is an N×N matrix, f is an N-dimensional vector, and c is a constant. For example, for N = 2,

\[
g(\mathbf{x}) = \frac{1}{2}
\begin{pmatrix} x_1 & x_2 \end{pmatrix}
\begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
+ \begin{pmatrix} f_1 & f_2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
+ c
= \frac{1}{2} h_{11} x_1^2 + \frac{1}{2}\,(h_{12}+h_{21})\, x_1 x_2 + \frac{1}{2} h_{22} x_2^2 + f_1 x_1 + f_2 x_2 + c \,.
\]
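As a concrete check of this identity, the following Python sketch evaluates the quadratic form for N = 2 and compares it with the expanded degree-2 polynomial; the variable names and random test values are purely illustrative and not taken from the studies cited above.

```python
import numpy as np

# Sketch: evaluate the inhomogeneous quadratic form
# g(x) = 1/2 x^T H x + f^T x + c for N = 2 and compare it with the
# expanded degree-2 polynomial (all values are illustrative).
rng = np.random.default_rng(0)
N = 2
H = rng.standard_normal((N, N))   # N x N matrix (need not be symmetric)
f = rng.standard_normal(N)        # N-dimensional vector
c = 0.5                           # constant term

def quadratic_form(x, H, f, c):
    """Return g(x) = 1/2 x^T H x + f^T x + c."""
    return 0.5 * x @ H @ x + f @ x + c

x = np.array([1.0, -2.0])
expanded = (0.5 * H[0, 0] * x[0] ** 2
            + 0.5 * (H[0, 1] + H[1, 0]) * x[0] * x[1]
            + 0.5 * H[1, 1] * x[1] ** 2
            + f[0] * x[0] + f[1] * x[1] + c)
assert np.isclose(quadratic_form(x, H, f, c), expanded)
```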
As also noticed by Hashimoto (2003), for each quadratic form there exists an equivalent two-layer neural network, which can be derived by rewriting the quadratic form using its eigenvector decomposition:
\[
g(\mathbf{x}) = \frac{1}{2}\,\mathbf{x}^{T} V D V^{T}\mathbf{x} + \mathbf{f}^{T}\mathbf{x} + c
= \frac{1}{2}\sum_{k=1}^{N} \mu_k \,(\mathbf{v}_k^{T}\mathbf{x})^{2} + \mathbf{f}^{T}\mathbf{x} + c \,,
\]

where V is the matrix of the eigenvectors $\mathbf{v}_k$ of H and D is the diagonal matrix of the corresponding eigenvalues $\mu_k$, so that $V^{T} H V = D$. One can thus define a neural network with a first layer formed by a set of N linear subunits $s_k(\mathbf{x}) = \mathbf{v}_k^{T}\mathbf{x}$, followed by a quadratic nonlinearity weighted by the coefficients $\mu_k/2$. The output neuron sums the contribution of all subunits plus the output of a direct linear connection from the input layer (Figure 14a).
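This equivalence between the quadratic form and the two-layer network can be verified numerically. The sketch below assumes that H is symmetric, which can be done without loss of generality since only the symmetric part (H + Hᵀ)/2 contributes to the quadratic term; all names and values are illustrative.

```python
import numpy as np

# Sketch: the quadratic form rewritten as a two-layer network.
# H is taken symmetric (w.l.o.g., since only the symmetric part
# (H + H^T)/2 contributes to x^T H x), so eigh returns real
# eigenvalues mu_k and an orthonormal eigenvector matrix V with
# V^T H V = diag(mu_1, ..., mu_N).
rng = np.random.default_rng(1)
N = 5
A = rng.standard_normal((N, N))
H = (A + A.T) / 2
f = rng.standard_normal(N)
c = -0.3

mu, V = np.linalg.eigh(H)          # columns of V are the eigenvectors v_k

def g_direct(x):
    return 0.5 * x @ H @ x + f @ x + c

def g_network(x):
    # First layer: N linear subunits s_k = v_k^T x, each followed by a
    # quadratic nonlinearity weighted by mu_k / 2; the output unit adds
    # the direct linear connection f^T x and the constant c.
    s = V.T @ x
    return 0.5 * np.sum(mu * s ** 2) + f @ x + c

x = rng.standard_normal(N)
assert np.isclose(g_direct(x), g_network(x))
```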
Because the eigenvalues can be negative, some of the subunits give an inhibitory contribution to the output. The weight vectors $\mathbf{v}_k$ are uniquely determined only if the eigenvalues of H are all distinct; otherwise the decomposition of H is arbitrary within the subspace corresponding to a repeated eigenvalue. Moreover, the coefficients of the subunits are fixed only if one assumes that the weight vectors have unit norm.
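A small numerical illustration of this ambiguity: when an eigenvalue is repeated, rotating the corresponding eigenvectors within their subspace changes the subunits but leaves the quadratic form unchanged (the eigenvalues and rotation angle below are chosen purely for illustration).

```python
import numpy as np

# Sketch: non-uniqueness of the subunits for a repeated eigenvalue.
# With mu_1 = mu_2, rotating the first two eigenvectors within their
# subspace gives different weight vectors but the same matrix H, and
# hence the same quadratic form (all values are illustrative).
mu = np.array([2.0, 2.0, -1.0])            # eigenvalue 2.0 is repeated
V = np.eye(3)                              # one valid eigenvector basis

theta = 0.7                                # rotation inside the degenerate subspace
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,          1.0]])
V_alt = V @ R                              # an equally valid eigenvector basis

H = V @ np.diag(mu) @ V.T
H_alt = V_alt @ np.diag(mu) @ V_alt.T
assert np.allclose(H, H_alt)               # same H, hence the same network output

# Similarly, dropping the unit-norm convention lets the norm of a weight
# vector be traded against its coefficient: a * v_k with mu_k / a**2 gives
# the same contribution (mu_k / 2) * (v_k^T x)^2.
```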