Abstract
Neural responses to visual inputs change continuously over time. Even for simple static stimuli, responses in visual cortex grow sublinearly as stimulus duration is prolonged (subadditive temporal summation), are reduced when stimuli are repeated (adaptation), and rise more slowly for low-contrast stimuli (phase delay). These phenomena are often studied independently. Here, we demonstrate all three within the same experiment and capture the underlying neural computations with a single computational model. We extracted time-varying responses from electrocorticographic (ECoG) recordings in patients who viewed grayscale pattern stimuli varying in contrast, duration, and inter-stimulus interval (ISI). Aggregating data across patients yielded 88 electrodes with robust visual responses, covering both earlier (V1-V3) and higher-order (V3a/b, LO, TO, IPS) retinotopic maps. In all regions, the ECoG responses exhibit several nonlinear dynamics: peak response amplitude saturates at high contrasts and long stimulus durations; the response to a second stimulus is suppressed at short ISIs and recovers at longer ISIs; and response latency decreases with increasing contrast. These dynamics are accurately predicted by a computational model comprising a small set of canonical neuronal operations: linear filtering, rectification, exponentiation, and delayed divisive normalization. We find that an increased normalization term captures both adaptation- and contrast-related response reductions, suggesting potentially shared underlying mechanisms. We additionally demonstrate both changes and invariance in temporal dynamics across the visual hierarchy. First, temporal summation windows lengthen systematically from earlier to higher visual areas, whereas recovery time from adaptation is relatively invariant across areas. Second, response amplitudes become more invariant to contrast in higher visual areas, but response latencies do not. Together, our results reveal a wide range of temporal neuronal dynamics in human visual cortex and demonstrate that a simple model captures these dynamics at millisecond resolution.
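For concreteness, the canonical operations listed above can be composed into a delayed divisive normalization computation of the following general form (a minimal sketch; the impulse responses $h_1$ and $h_2$ and the parameters $n$ and $\sigma$ are illustrative placeholders rather than the fitted model components):
\[
L(t) = \max\bigl(0,\, (s \ast h_1)(t)\bigr), \qquad
r(t) = \frac{L(t)^{\,n}}{\sigma^{\,n} + \bigl((L \ast h_2)(t)\bigr)^{\,n}},
\]
where $s(t)$ is the stimulus time course, convolution with $h_1$ implements linear filtering, the $\max$ implements rectification, the exponent $n$ implements exponentiation, and division by a delayed (low-pass filtered via $h_2$) copy of the linear response implements delayed divisive normalization.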