An alternative possibility is a “2D pooling process” in which motion signals are first integrated across orientation over a small region, and the resulting 2D local motion signals are then integrated globally over space (Figures 1c–1e). The critical difference from the 1D pooling
process is that the aperture problem is solved locally before spatial pooling. In support of this hypothesis, recent electrophysiological findings suggest that the spatial range of cross-orientation integration of MT/V5 neurons is relatively small (Majaj, Carandini, & Movshon, 2007; Rust, Mante, Simoncelli, & Movshon, 2006). Theoretically, the 2D pooling
process can explain global-motion perception not only when local motion is conveyed by a 2D pattern such as a dot (i.e., the 2D pooling phenomenon), but also when local motions are conveyed by 1D patterns, as in the components that make up multi-aperture plaid stimuli (i.e., the 1D pooling phenomenon). These possibilities are called the “2D by 2D” (Figure 1e) and “1D by 2D” (Figures 1c and 1d) hypotheses, respectively. As shown in
Figure 1, there are two versions of the “1D by 2D” hypothesis. One (the “semi-local 1D pooling” hypothesis; Figure 1c) computes 2D motion signals by locally processing adjacent 1D signals that fall within a small spatial area for cross-orientation integration. The other (the “orthogonal vector” hypothesis; Figure 1d) uses the orthogonal vectors of 1D local motion as local 2D motion signals. When no other motion or form cues are available, a 1D pattern, e.g., a static Gabor with a moving carrier, is seen to move in the direction orthogonal to the carrier orientation. This suggests that orthogonal motion is the default solution of the aperture problem for 1D patterns. A 2D pooling process could then treat these 1D signals (which still contain direction and speed information) in the same manner as (true) 2D signals.
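
The two versions of the “1D by 2D” hypothesis can be contrasted with a minimal numerical sketch. It is illustrative only: the text does not specify the pooling rule, so plain vector averaging over space is assumed here, and the local cross-orientation integration is assumed to be a least-squares intersection-of-constraints solve. The function names, the grouping of adjacent elements, and the toy stimulus (four elements at 45° and 135° with a rightward global motion) are all hypothetical.

```python
import numpy as np


def one_d_signals(true_velocity, orientations):
    """Return unit normals and signed normal speeds for 1D patterns.

    Each 1D pattern signals only the velocity component along the
    normal to its orientation (the aperture problem).
    """
    normals = np.stack([-np.sin(orientations), np.cos(orientations)], axis=1)
    speeds = normals @ true_velocity
    return normals, speeds


def orthogonal_vector_pooling(normals, speeds):
    """Assumed sketch of the orthogonal-vector version (Figure 1d).

    Each orthogonal (normal) velocity is used directly as a local 2D
    signal, and the signals are then averaged over space.
    """
    local_2d = speeds[:, None] * normals
    return local_2d.mean(axis=0)


def semi_local_pooling(normals, speeds, group_size):
    """Assumed sketch of the semi-local 1D pooling version (Figure 1c).

    Adjacent elements are grouped; within each group a 2D velocity v is
    found by least squares so that v . n_i = s_i for every element, and
    the resulting local 2D signals are averaged over space.
    """
    grouped_normals = normals.reshape(-1, group_size, 2)
    grouped_speeds = speeds.reshape(-1, group_size)
    local_2d = []
    for n, s in zip(grouped_normals, grouped_speeds):
        v, *_ = np.linalg.lstsq(n, s, rcond=None)
        local_2d.append(v)
    return np.mean(local_2d, axis=0)


# Toy stimulus: global motion to the right, alternating orientations.
true_velocity = np.array([1.0, 0.0])
orientations = np.deg2rad([45.0, 135.0, 45.0, 135.0])
normals, speeds = one_d_signals(true_velocity, orientations)

orthogonal_estimate = orthogonal_vector_pooling(normals, speeds)
semi_local_estimate = semi_local_pooling(normals, speeds, group_size=2)
print("orthogonal vector:", orthogonal_estimate)
print("semi-local 1D pooling:", semi_local_estimate)
```

In this symmetric toy case both estimates point in the true direction. Under the assumed averaging rule, however, the orthogonal-vector estimate has about half the true speed, whereas the local least-squares solve recovers the full velocity. This difference is a property of the assumptions made in the sketch, not a prediction taken from the text.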