The effects of configuration on stereopsis at detection threshold have most commonly been explained by a form of disparity pooling or averaging (McKee, 1983; Mitchison & Westheimer, 1984; Vreven, McKee, & Verghese, 2002). While the neural processing underlying such disparity-based pooling is typically unspecified, a simple hierarchical model is often assumed. That is, binocular neurons in later visual processing areas extract more global shape information by pooling information from disparity-selective neurons in V1. Disparity pooling or averaging may also be responsible for the results presented here and in our previous experiments (Deas & Wilcox, 2014). However, our results are not consistent with simple feed-forward hierarchical processing models, because we have shown that 2-D and 3-D grouping cues play a critical role in modulating the observed reduction in perceived depth. Instead, our results are more consistent with current models of cortical processing that support recurrent feedback between mid- and low-level processing (among others, see Deco & Lee, 2002; Desimone & Duncan, 1995; Lee & Mumford, 2003). As outlined by Markov and Kennedy (2013), there is compelling physiological and computational evidence for the existence, and importance, of both feed-forward and rapid feedback networks in the visual system (Markov et al., 2014; Samonds, Potetz, Tyler, & Lee, 2013). To account for our results as well as configuration-dependent threshold elevation (Fahle & Westheimer, 1988; McKee, 1983; Mitchison & Westheimer, 1984; Westheimer, 1979), disparity pooling must be constrained by feedback that provides an object- or surface-based representation of the stimulus. Importantly, as a result of this object-based pooling, the visual system sacrifices the precision and accuracy afforded by the relatively smaller receptive field sizes in earlier visual areas. It is likely that the disparity-based depth discontinuities are initially encoded in area V2 (von der Heydt, Zhou, & Friedman, 2000) but are subject to pooling based on representations of surfaces in a variety of extrastriate areas including IT, MT, and CIP (Hegde & Van Essen, 2005; Janssen, Vogels, & Orban, 1999, 2000; Nguyenkim & DeAngelis, 2003; Rosenberg, Cowan, & Angelaki, 2013; Verhoef, Vogels, & Janssen, 2010).