Abstract
In human visual motion processing, image motion is first detected by one-dimensional (1D), spatially local, direction-selective neural sensors. Each sensor is tuned to a particular combination of position, orientation, spatial frequency and feature type (e.g., first-order or second-order). To recover the true two-dimensional (2D), global direction of moving objects (i.e., to solve the aperture problem), the visual system integrates motion signals across orientation, across space and possibly across the other dimensions. We investigated this multi-dimensional motion integration process using global motion stimuli composed of numerous randomly oriented Gabor (1D) or plaid (2D) elements (to examine integration across space, orientation and spatial frequency), as well as diamond-shaped Gabor quartets undergoing rigid global circular translation (to examine integration across spatial frequency and feature type). We found that the visual system adaptively switches between two spatial integration strategies, spatial pooling of 1D motion signals and spatial pooling of 2D motion signals, depending on the ambiguity of the local motion signals. MEG recordings revealed neural activity in hMT+ correlated with both 1D pooling and 2D pooling. Our data also suggest that the visual system can integrate 1D motion signals of different spatial frequencies and different feature types, but only when form conditions (e.g., contour continuity) support grouping of the local motions. These findings indicate that motion integration is a complex and smart computation, which is presumably why we can properly estimate motion flows in the wide variety of natural scenes we encounter.
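The two pooling strategies contrasted above can be stated compactly in a conventional formalization; the following display is an illustrative sketch using standard notation (it is not taken from the text above). Let the 2D object velocity be v, let the i-th 1D (Gabor) element have contour-normal direction n_i and measured normal speed s_i, and let v_i denote a locally disambiguated 2D velocity estimate (e.g., from a plaid element). Each 1D sensor constrains the velocity only along its normal,

\[
\hat{\mathbf{n}}_i \cdot \mathbf{v} = s_i ,
\]

so pooling of 1D motion signals combines these ambiguous constraints across elements, for example as a least-squares intersection of constraints,

\[
\hat{\mathbf{v}}_{\mathrm{1D}} = \arg\min_{\mathbf{v}} \sum_{i=1}^{N} \left( \hat{\mathbf{n}}_i \cdot \mathbf{v} - s_i \right)^{2} ,
\]

whereas pooling of 2D motion signals averages velocities that have already been disambiguated locally,

\[
\hat{\mathbf{v}}_{\mathrm{2D}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{v}_i .
\]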