We examined how the BOLD response in visual areas depends on the amount of local motion in dynamic scenes generated by moving forward through everyday environments. This is the most common type of motion experienced by humans and a motion pattern for which MST appears to be specialized (Duffy & Wurtz, 1991; Smith et al., 2006). In the first experiment, we used a simple, biologically motivated computational model to quantify the amount of local motion across each movie clip, that is, its motion energy content as calculated by spatiotemporal correlation, and examined how the magnitude of the cortical response depends on motion signal content. The 2DMD model (Zanker, Srinivasan, & Egelhaaf,
1999) has been successful in analyzing similar forward natural motion (Zanker & Zeil, 2005) and in simulating psychophysical results (Zanker, 1997, 2004; Zanker & Braddick, 1999; Zanker, Hermens, & Walker, 2010). Motion correlation is formally equivalent to motion energy (Adelson & Bergen,
1985) under a range of conditions (Hildreth & Koch, 1987), so we do not expect our results to be specific to the model used. We also tested the hypothesis that, beyond the local motion output, familiarity may exert a top-down influence; inverting the contrast polarity of the scenes reduces familiarity while preserving motion characteristics. Finally, we produced novel stimuli that used the same clips but varied how much visual motion information was presented. To this end, we varied the proportion of the visible area by masking these clips with an opaque gray layer (similar to Kane, Bex, & Dakin, 2011) and changing the number of hard-edged circular apertures through which they could be seen. This restricted all visual information and thereby reduced the amount of available motion signals, the attribute for which MT and MST are selective. We varied the number of apertures, the size of the apertures, the spatial arrangement of visible local motion, and the overall image contrast, and also tested the response to counter-phase flickering images. In this way, we ascertained how the human visual response varies with the amount of dynamic information (coherent motion, incoherent motion, or flicker) in natural scenes.
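
To make the idea of quantifying local motion by spatiotemporal correlation concrete, the following is a minimal sketch of a generic opponent correlation-type (Reichardt-style) detector applied to a movie clip. It is an illustration only and not the 2DMD implementation: the function names `correlation_motion` and `motion_content`, the one-pixel sampling distance, the one-frame delay, the restriction to horizontal motion, and the use of the mean absolute output as a summary measure are all assumptions made for this example.

```python
import numpy as np


def correlation_motion(frames, dx=1, dt=1):
    """Opponent correlation-type horizontal motion signal (illustrative sketch).

    frames : array of shape (T, H, W) holding luminance values.
    dx, dt : assumed sampling distance (pixels) and delay (frames).
    Returns an array of shape (T - dt, H, W - dx);
    positive values indicate rightward motion, negative values leftward.
    """
    frames = np.asarray(frames, dtype=float)
    frames = frames - frames.mean()  # remove mean luminance

    left_past = frames[:-dt, :, :-dx]   # left input, delayed
    right_past = frames[:-dt, :, dx:]   # right input, delayed
    left_now = frames[dt:, :, :-dx]     # left input, current
    right_now = frames[dt:, :, dx:]     # right input, current

    # Each half-detector multiplies a delayed input with the current
    # neighbouring input; subtracting the mirror-symmetric half-detector
    # gives a direction-selective (opponent) output.
    return left_past * right_now - right_past * left_now


def motion_content(frames, dx=1, dt=1):
    """Hypothetical summary measure: mean absolute detector output of a clip."""
    return float(np.mean(np.abs(correlation_motion(frames, dx, dt))))
```

Under these assumptions, a texture drifting rightward yields a positive summed output, a leftward drift yields a negative one, and a static clip yields exactly zero, so `motion_content` grows with the amount of local motion signal in the clip.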