Age-related macular degeneration (AMD) is the major cause of blindness in the developed world. Though substantial work has been done to characterize the disease, it is difficult to predict how the state of an individual's retina will ultimately affect their high-level perceptual function. In this paper, we describe an approach that couples retinal imaging with computational neural modeling of early visual processing to generate quantitative predictions of an individual's visual perception. Using a patient population with mild to moderate AMD, we show that we are able to accurately predict subject-specific psychometric performance by decoding simulated neurodynamics that are a function of scotomas derived from an individual's fundus image. On the population level, we find that our approach maps the disease on the retina to a representation that is a substantially better predictor of high-level perceptual performance than traditional clinical metrics such as drusen density and coverage. In summary, our work identifies possible new metrics for evaluating the efficacy of treatments for AMD at the level of the expected changes in high-level visual perception and, in general, typifies how computational neural models can be used as a framework to characterize the perceptual consequences of early visual pathologies.

*μ*m. Soft drusen, on the other hand, are found in both early and late-stage macular degeneration and are typically associated with pigmentary changes as the disease progresses. They appear yellow and fuzzy on color fundus photographs and are either larger than 125 μm or between 63 and 125 μm with visible thickness (Klein et al., 1991).

*ρ*

^{AMD}(

$N$ (4096) conductance-based integrate-and-fire point neurons (one compartment), representing about a 2 × 2 mm$^2$ piece of a V1 input layer (layer 4C). Our model of V1 consists of 75% excitatory neurons and 25% inhibitory neurons. The dynamic variables of each neuron are the membrane potential $v_i(t)$ and the spike train $\mathcal{S}_i(t) = \sum_k \delta(t - t_{i,k})$, where $t$ is time and $t_{i,k}$ is the time of the $k$th spike of the $i$th neuron, $i = 1, \ldots, N$. Each neuron is modeled as

$g_{L,i}$, $g_{E,i}$, and $g_{I,i}$ represent the leakage, excitatory, and inhibitory conductances of neuron $i$.
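As an illustration, the dynamics of a single conductance-based integrate-and-fire point neuron can be sketched as follows. This is a minimal sketch and not the model's implementation: the membrane equation $dv/dt = -g_L(v - v_L) - g_E(v - v_E) - g_I(v - v_I)$ (capacitance normalized to one), the forward-Euler scheme, the function name `simulate_if_neuron`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_if_neuron(g_exc, g_inh, dt=0.1, g_leak=50.0,
                       v_leak=-70.0, v_exc=0.0, v_inh=-80.0,
                       v_thresh=-55.0, v_reset=-70.0):
    """Forward-Euler integration of one integrate-and-fire point neuron.

    Assumed (hypothetical) units: conductances in 1/s with capacitance
    normalized to one, voltages in mV, time step dt in ms.
    g_exc, g_inh: 1-D arrays with one conductance value per time step.
    Returns the membrane potential trace and the list of spike times (ms).
    """
    v = v_leak
    spike_times = []
    trace = np.empty(len(g_exc))
    for step, (ge, gi) in enumerate(zip(g_exc, g_inh)):
        # Leak, excitatory, and inhibitory currents drive the potential.
        dv = -(g_leak * (v - v_leak) + ge * (v - v_exc) + gi * (v - v_inh))
        v += dv * dt * 1e-3  # convert dt from ms to s
        if v >= v_thresh:
            # Threshold crossing: record a spike and reset the potential.
            spike_times.append(step * dt)
            v = v_reset
        trace[step] = v
    return trace, spike_times
```

With zero synaptic conductance the neuron rests at the leak potential; a sufficiently large constant excitatory conductance produces regular firing.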

$\mathcal{N}_k(E)$ and $\mathcal{N}_k(I)$, $k = 0, 1$, a population that receives LGN input ($k = 1$) and one that does not ($k = 0$). In the model, 30% of both the excitatory and inhibitory cell populations receive LGN input. Noise, cortical interactions, and LGN input are assumed to act additively in contributing to the total conductance of a cell:

$\eta_{\mu,i}(t)$ is a cell-specific external stochastic term representing synaptic noise for a cortical excitatory ($\mu = \mathrm{E}$) or inhibitory ($\mu = \mathrm{I}$) neuron (see Wielaard & Sajda, 2006b and Supplementary materials for details). The terms $g_{\mu,i}^{\mathrm{cor}}(t, [\mathcal{S}]_\mu)$ are the contributions from the cortical excitatory ($\mu = \mathrm{E}$) and inhibitory ($\mu = \mathrm{I}$) neurons and include only isotropic connections:

$i \in \mathcal{N}_{k'}(\mu')$. Here, $\vec{x}_i$ is the spatial position (in cortex) of neuron $i$, the functions $G_{\mu,j}(\tau)$ describe the synaptic dynamics of cortical synapses, and the functions $C_{\mu',\mu}^{k',k}(r)$ describe the cortical spatial couplings (cortical connections). The length scales of excitatory and inhibitory connections are about 200 μm and 100 μm, respectively.

$j \in \mathcal{N}_1(\mu)$, is connected to a set $N_{L,j}^{\mathrm{LGN}}$ of left eye LGN cells or to a set $N_{R,j}^{\mathrm{LGN}}$ of right eye LGN cells:

$Q = \mathrm{L}$ or $\mathrm{R}$ (i.e., left or right eye). Here, $[x]_+ = x$ if $x \geq 0$ and $[x]_+ = 0$ if $x \leq 0$, $\mathcal{L}_\ell(r)$ and $G_\ell^{\mathrm{LGN}}(\tau)$ are the spatial and temporal LGN kernels, respectively, $\vec{y}_\ell$ is the receptive field center of the $\ell$th left or right eye LGN cell, which is connected to the $j$th cortical cell, and $I(s)$ is the visual stimulus. The parameters $g_\ell^0$ represent the maintained activity of LGN cells and the parameters $g_\ell^V$ measure their responsiveness to visual stimuli. The binary mask $\rho^{\mathrm{AMD}}$ is applied to the visual stimulus $I(s)$ at the location of the scotoma and acts multiplicatively on the input conductance to the V1 model.
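The roles of the half-wave rectification and the binary scotoma mask can be illustrated with a short sketch. It is a simplification under stated assumptions: a single stimulus frame with no temporal kernel, a mask convention of 1 for healthy retina and 0 for scotoma, and hypothetical names (`rectify`, `lgn_conductance`) and parameter values that are not taken from the model.

```python
import numpy as np

def rectify(x):
    """Half-wave rectification: [x]_+ = x for x >= 0 and 0 otherwise."""
    return np.maximum(x, 0.0)

def lgn_conductance(stimulus, mask, kernel, g0=1.0, gv=2.0):
    """Rectified drive of one LGN cell for a single stimulus frame.

    stimulus: 2-D image; mask: 2-D binary array (assumed 1 = healthy,
    0 = scotoma); kernel: 2-D spatial receptive field of the same shape.
    g0 (maintained activity) and gv (visual responsiveness) are
    illustrative values only.
    """
    masked = mask * stimulus  # the mask acts multiplicatively on the input
    return float(rectify(g0 + gv * np.sum(kernel * masked)))
```

Inside a scotoma the visual term vanishes and only the maintained activity `g0` remains.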

$k$ is a normalization constant, $\sigma_{c,\ell}$ and $\sigma_{s,\ell}$ are the center and surround sizes, respectively, and $K_\ell$ is the integrated surround–center sensitivity. The temporal kernels are normalized in Fourier space, $\int_{-\infty}^{\infty} |\tilde{G}_\ell^{\mathrm{LGN}}(\omega)|\, d\omega = 1$, where $\tilde{G}_\ell^{\mathrm{LGN}}(\omega) = (2\pi)^{-1} \int_{-\infty}^{\infty} G_\ell^{\mathrm{LGN}}(t)\, e^{-i\omega t}\, dt$. For the magnocellular architecture, the time constants are $\tau_1 = 2.5$ ms and $\tau_2 = 7.5$ ms, and $c = (\tau_1/\tau_2)^6$ so that $\tilde{G}_\ell^{\mathrm{LGN}}(0) = 0$, in agreement with experiments (Benardete & Kaplan, 1999). For the parvocellular architecture, the time constants are $\tau_1 = 8$ ms and $\tau_2 = 9$ ms, and $c = 0.7(\tau_1/\tau_2)^5$. The delay times $\tau_\ell^0$ are drawn from a uniform distribution between 20 ms and 30 ms in all cases. Sizes for center and surround were taken from experimental data (Croner & Kaplan, 1995; Derrington & Lennie, 1984; Hicks, Lee, & Vidyasagar, 1983; Shapley, 1990; Spear, Moore, Kim, Xue, & Tumosa, 1984) and were $\sigma_{c,\ell} = \sigma_c = 0.1°$ (magno) and $0.04°$ (parvo) for centers and $\sigma_{s,\ell} = \sigma_s = 0.72°$ (magno) and $0.32°$ (parvo) for surrounds. The integrated surround–center sensitivity was in all cases $K_\ell = 0.55$ (Croner & Kaplan, 1995). By design, no diversity has been introduced in the center and surround sizes, in order to demonstrate the level of diversity resulting purely from the cortical interactions and the connection specificity between LGN cells and cortical cells (i.e., the sets $N_{Q,j}^{\mathrm{LGN}}$, see specifications below). Furthermore, no distinction was made between ON-center and OFF-center LGN cells other than the sign reversal of their receptive fields (± sign in Equation 7). The LGN RF centers $\vec{y}_\ell$ were organized on a square lattice with lattice constant $\sigma_c/2$. These lattice spacings and the consequent LGN receptive field densities imply LGN cellular magnification factors that are in the range of the experimental data available for macaque (Connolly & Van Essen, 1984; Malpeli, Lee, & Baker, 1996). The connection structure between LGN cells and cortical cells, given by the sets $N_{Q,j}^{\mathrm{LGN}}$, is constructed so as to establish ocular dominance bands and a slight orientation preference that is organized in pinwheels (Blasdel, 1992). It is further constructed under the constraint that the LGN axonal arbor sizes in V1 do not exceed the anatomically established values of 1.2 mm for magnocellular and 0.6 mm for parvocellular neurons (Blasdel & Lund, 1983; Freund, Martin, Soltesz, Somogyi, & Whitteridge, 1989).
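To make the center and surround parameters concrete, the sketch below evaluates a difference-of-Gaussians spatial profile with the sizes quoted above. The exact functional form and normalization used in the model are not reproduced here. The Gaussian form, the unit-volume normalization of each lobe (so that the surround-to-center volume ratio equals $K$), and the names `dog_kernel`, `MAGNO`, and `PARVO` are assumptions for illustration.

```python
import numpy as np

def dog_kernel(r, sigma_c, sigma_s, K=0.55):
    """Difference-of-Gaussians profile at eccentricity r (degrees).

    Each Gaussian is normalized to unit volume over the plane, so the
    surround volume divided by the center volume equals K. This
    normalization is an assumption, not the model's stated convention.
    """
    center = np.exp(-(r / sigma_c) ** 2) / (np.pi * sigma_c ** 2)
    surround = np.exp(-(r / sigma_s) ** 2) / (np.pi * sigma_s ** 2)
    return center - K * surround

# Center and surround sizes (degrees) quoted in the text.
MAGNO = dict(sigma_c=0.1, sigma_s=0.72)
PARVO = dict(sigma_c=0.04, sigma_s=0.32)
```

Under this normalization the profile is positive at the receptive field center, negative in the surround, and integrates to $1 - K$ over the plane.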

- Parameters related to the integrate-and-fire mechanism, such as threshold, reset voltage, and leakage conductance. These are identical for all cells (Equation 2).
- The cortical interaction strengths and connectivity length scales. These are represented by the functions $C_{\mu',\mu}^{k',k}(r)$, which are not cell specific but only specific with respect to the four cell populations. Note that the functions $C_{\mu',\mu}^{k',k}(r)$ are also not configuration specific (Equation 4).
- Maintained activity and responsiveness to visual stimulation of LGN cells (Equation 5).
- Receptive field sizes of LGN cells. These are neither cell nor population specific (i.e., where “population” in this case refers to the ON and OFF LGN cell populations) but are only specific with respect to the four model configurations, i.e., receptive field sizes of all LGN cells are identical for a particular configuration (Equation 7).

- The external noisy conductances $\eta_{E,i}(t)$ (excitatory) and $\eta_{I,i}(t)$ (inhibitory) (Equation 3).
- The LGN connectivity to our model cortex as described by $N_{L,j}^{\mathrm{LGN}}$ and $N_{R,j}^{\mathrm{LGN}}$ (Equation 5).

$i$, in the population, for each trial $k$, as $s_{i,k}(t) = \sum_l \delta(t - t_{i,k,l})$, where $t \in [0, 250]$ ms, $i = 1 \ldots N$ is the index for neurons, $k = 1 \ldots M$ is the index for trials, and $l = 1 \ldots P$ is the index for spikes. Based on the population spike trains, we estimated the firing rate on each trial by counting the number of spikes within a time bin of width $\tau$, resulting in a “spike count matrix” $r_{i,j,k} = \int_{(j-1)\tau+1}^{j\tau} s_{i,k}(t)\, dt$, where $i = 1 \ldots N$ is the index for neurons, $j = 1 \ldots T/\tau$ is the index for time bins, and $k = 1 \ldots M$ is the index for trials. When $\tau = 25$ ms, we are assuming that information is encoded in the temporal precision of the population activity, since temporal precision is required so that the spike count matrix does not change substantially from trial to trial by having spikes switch from one bin to another. When $\tau = 250$ ms, we integrate the spiking activity over the entire trial, leading to a rate-based representation of information.
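The binning step can be sketched in a few lines. This is an illustrative sketch rather than the analysis code used for the paper: the nested-list layout of `spike_times`, the function name `spike_count_matrix`, and the use of `numpy.histogram` are assumptions.

```python
import numpy as np

def spike_count_matrix(spike_times, n_neurons, n_trials, tau, T=250.0):
    """Bin spike times into a (neurons x time bins x trials) count array.

    spike_times[k][i] holds the spike times (ms) of neuron i on trial k
    (a hypothetical data layout). tau is the bin width in ms and T the
    trial duration in ms.
    """
    n_bins = int(round(T / tau))
    edges = np.linspace(0.0, T, n_bins + 1)
    r = np.zeros((n_neurons, n_bins, n_trials), dtype=int)
    for k in range(n_trials):
        for i in range(n_neurons):
            # Count the spikes of neuron i on trial k in each time bin.
            r[i, :, k], _ = np.histogram(spike_times[k][i], bins=edges)
    return r
```

Calling the function with `tau=25.0` yields ten bins per trial (the temporally precise representation), whereas `tau=250.0` yields a single bin per trial (the rate-based representation).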

$b \in \mathbb{R}^m$ take the value of $\{-1, +1\}$ (either face or car). We then compute the weighted sum over the population spike count matrix. For notational convenience, we replace the spike count matrix $r_{i,j,k}$ with the stacked matrix $x_{l,k}$, where $x_{l,k} = r_{i,j,k}$ and $l = (i - 1)n + j$, which leads to the following constrained minimization problem:

$\lambda > 0$ is a regularization parameter that controls the sparsity of the decoder, $w \in \mathbb{R}^n$ specifies the weights, $v \in \mathbb{R}$ is the offset, and $\theta$ is the logistic loss function, defined by $\theta(z) = \log(1 + \exp(-z))$. Such a formulation essentially minimizes the average logistic loss defined by the first term in the minimization, with a Lagrange multiplier for the $\ell_1$ norm of the weights; i.e., it reduces the classification error while choosing as few elements in the spike count matrix as possible. The resultant linear decoder can be geometrically interpreted as a hyperplane defined by $w^{\mathrm{T}} X + v = 0$, which separates the classes of face and car. We optimize Equation 8 using the hybrid iterative shrinkage (HIS) algorithm (Shi, Yin, Osher, & Sajda, 2010).
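The structure of the sparse decoder can be illustrated with a small sketch. Note that it does not implement the HIS algorithm: a plain proximal-gradient (iterative shrinkage) loop is swapped in for simplicity. The objective is assumed to be the average logistic loss plus $\lambda \lVert w \rVert_1$, and the function names, step size, and iteration count are hypothetical.

```python
import numpy as np

def stack_counts(r):
    """Stack a (neurons x bins x trials) array into (features x trials).

    With zero-based indices the row index is l = i * n_bins + j.
    """
    return r.reshape(-1, r.shape[2])

def objective(w, v, X, b, lam):
    """Average logistic loss plus lam * ||w||_1 (assumed objective form).

    X: (features x trials) matrix, b: labels in {-1, +1}.
    """
    margins = b * (w @ X + v)
    # log(1 + exp(-z)) evaluated in a numerically stable way.
    return np.logaddexp(0.0, -margins).mean() + lam * np.abs(w).sum()

def soft_threshold(u, t):
    """Shrinkage operator, the proximal map of t * ||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def fit_sparse_logistic(X, b, lam, step=0.1, n_iter=500):
    """Proximal-gradient fit of the weights w and offset v."""
    n_features, n_trials = X.shape
    w, v = np.zeros(n_features), 0.0
    for _ in range(n_iter):
        margins = b * (w @ X + v)
        # Derivative of the mean logistic loss with respect to w.X + v;
        # 0.5 * (1 - tanh(z / 2)) equals 1 / (1 + exp(z)) without overflow.
        coeff = -b * 0.5 * (1.0 - np.tanh(margins / 2.0)) / n_trials
        w = soft_threshold(w - step * (X @ coeff), step * lam)
        v = v - step * coeff.sum()
    return w, v
```

Larger values of `lam` drive more weights exactly to zero, so that fewer entries of the spike count matrix contribute to the decision.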

$K$-fold cross-validation (where $K = 10$) was used on the training set, while the weights applied to the testing set were estimated using jackknife estimation to eliminate the bias.

*K*-fold results of the model. Az can be seen as the probability of a correct decision by the decoder (Green & Swets, 1966). These data were then also fit using a Weibull function.
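As a rough illustration of the Az measure, the area under the ROC curve can be computed directly from decoder outputs as the probability that a trial of one class receives a higher score than a trial of the other class. The rank-based estimator below, the function name `area_under_roc`, and the $\{-1, +1\}$ label convention are assumptions for the sketch. It is not necessarily how Az was computed in the study.

```python
import numpy as np

def area_under_roc(scores, labels):
    """Area under the ROC curve from decoder scores.

    Estimated as the fraction of (positive, negative) trial pairs in
    which the positive trial scores higher, counting ties as one half.
    labels are assumed to take values in {-1, +1}.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())
```

A value of 0.5 corresponds to chance performance and 1.0 to perfect discrimination.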

*L*s) obtained from these two conditions were transformed by

$\lambda$ is distributed as $\chi^2$ with 2 degrees of freedom (Hoel et al., 1971). If $\lambda$ does not exceed the criterion value (for $p = 0.05$), we conclude that we cannot reject the hypothesis that a single function fits the two data sets as well as two separate functions.
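The decision rule can be made concrete with a short sketch. It assumes the standard likelihood ratio statistic $\lambda = 2(\ln L_{\text{separate}} - \ln L_{\text{single}})$, which is an assumption about the transformation rather than a quotation of it, and it uses the closed-form tail probability of a $\chi^2$ variable with 2 degrees of freedom, $P(\chi^2 > x) = e^{-x/2}$. All function names are hypothetical.

```python
import math

def likelihood_ratio_statistic(loglik_single, loglik_separate):
    """Assumed form: lambda = 2 * (ln L_separate - ln L_single)."""
    return 2.0 * (loglik_separate - loglik_single)

def chi2_sf_2dof(x):
    """Tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Criterion value for p = 0.05 with 2 degrees of freedom (about 5.99).
CRITICAL_005 = -2.0 * math.log(0.05)

def single_fit_not_rejected(loglik_single, loglik_separate, alpha=0.05):
    """True if one function fits both data sets as well as two functions."""
    lam = likelihood_ratio_statistic(loglik_single, loglik_separate)
    return chi2_sf_2dof(lam) > alpha
```

For example, log-likelihoods of −100 (single fit) and −98 (separate fits) give $\lambda = 4$, which is below the criterion value, so the single-function hypothesis is not rejected.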

$r(\theta)$ is the mean firing rate at orientation $\theta \in [0, 2\pi]$. CV is a measure of orientation selectivity, where a smaller value of CV indicates greater orientation selectivity. When CV = 0, the neuron responds to only one orientation; when CV = 1, the neuron responds equally to all orientations and, hence, is not selective for orientation. Orientation selectivity is one of the fundamental properties of the early visual system and is a major element of form vision needed for object recognition and discrimination.
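A minimal sketch of a circular variance computation is given below. It assumes the commonly used definition $\mathrm{CV} = 1 - \left|\sum_\theta r(\theta) e^{2i\theta}\right| / \sum_\theta r(\theta)$, evaluated on a discrete set of orientations. The definition and the function name `circular_variance` are assumptions, since the defining equation is not reproduced here.

```python
import numpy as np

def circular_variance(theta, rates):
    """Circular variance of an orientation tuning curve.

    theta: stimulus angles in radians, sampled over [0, 2*pi).
    rates: mean firing rate at each angle.
    The factor 2 in the exponent makes angles that differ by pi count
    as the same orientation (assumed standard definition).
    """
    theta = np.asarray(theta, dtype=float)
    rates = np.asarray(rates, dtype=float)
    resultant = np.abs(np.sum(rates * np.exp(2j * theta)))
    return float(1.0 - resultant / np.sum(rates))
```

A flat tuning curve gives CV = 1, and a response confined to a single orientation gives CV = 0, matching the two limiting cases described above.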

$p \ll 0.05$), and neurometric Az is an extremely good predictor for psychometric Az ($\mathrm{Az}_{\mathrm{psycho}} = 1.006 \times \mathrm{Az}_{\mathrm{neuro}} - 0.0647$). We then compared the predictive value of our model to more conventional predictive measures based on direct analysis of the fundus image. To characterize the fundus images, we defined the drusen index (DI) as the fraction of drusen-free area on the fundus:

*ρ*

^{AMD}(
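The drusen index and the reported linear relation between neurometric and psychometric Az can be sketched as follows. The mask convention (1 for drusen-free pixels, 0 for drusen) and the function names are assumptions for illustration. The regression coefficients are the ones quoted above.

```python
import numpy as np

def drusen_index(mask):
    """Fraction of drusen-free area in a binary fundus mask.

    mask: 2-D array, assumed 1 (or True) where the fundus is
    drusen-free and 0 (or False) where drusen are present.
    """
    mask = np.asarray(mask, dtype=bool)
    return float(mask.mean())

def predict_psychometric_az(az_neuro):
    """Linear fit reported in the text: Az_psycho = 1.006 * Az_neuro - 0.0647."""
    return 1.006 * az_neuro - 0.0647
```

A drusen index of 1 corresponds to a fundus without drusen, and smaller values indicate greater drusen coverage.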

*α*. Proceedings of the National Academy of Sciences of the United States of America, 97, 8087–8092.

$\ell_1$-regularized logistic regression. Journal of Machine Learning Research, 11, 713–741.