The asymmetry between contour detection and discrimination, in both single-cue performance and cue summation gains, supports the involvement of higher visual areas. Disparities between target detection and conscious perception of object shape have been found in numerous studies. Detection performance for objects can significantly exceed categorization performance (Bowers & Jones,
2008; Sagi & Julesz,
1984), the time course of object detection and categorization may be manipulated selectively (Mack, Gauthier, Sadr, & Palmeri,
2008), visual adaptation affects detection and identification performance differently (Hillis & Brainard,
2007), and activation in V1 has tight couplings with detection performance while identification correlates more with activation in later areas like LOC and the collateral sulcus (Straube & Fahle,
2011). Our results show not only a difference between detection and identification performance as such, but also a marked asymmetry in the magnitude of cue summation gains between the two tasks. In both experiments and all conditions, we found contour discrimination to benefit more than detection from cue combination. Similar results have been reported in previous research on the combination of multiple cues in figure-ground segregation (Meinhardt et al.,
2006; Persike & Meinhardt,
2008) and are consistent with neuroimaging studies showing that higher ventral regions responsible for shape representation respond more strongly to the combination of features than early retinotopic areas do (Altmann et al.,
2003). Even if super-linear salience gains are generated in V1 (see above) to aid detection, we propose that a second mechanism benefits even more strongly from the conjunction of cues. This mechanism accomplishes contour completion and shape discrimination on a larger scale. Shape representation, independent of the type of feature cue and invariant to size and location, is found in higher ventral areas, such as the parietal cortex, inferior temporal cortex (IT), and the LOC (Kourtzi & Huberle,
2005; Kourtzi & Kanwisher,
2000; Lerner, Hendler, Ben-Bashat, Harel, & Malach,
2001). The idea of a higher-level neural implementation of cue combination is supported by the first study to report an electrophysiological correlate of feature synergy, which emerged no earlier than 130 ms after stimulus onset in IT and propagated back to earlier visual areas from there (Kida, Tanaka, Takeshima, & Kakigi,
2011). Moreover, the LOC responds to perceived global shape and is particularly sensitive to the contour lines of objects. LOC responses are not modulated by the familiarity of objects, suggesting a stimulus-driven analysis without reference to stored knowledge about specific object form (Kourtzi & Kanwisher,
2000). BOLD responses in V1 and V2 are strongly modulated by a change of local element orientation, but hardly by a change in global shape. Conversely, the LOC responds most strongly to a change of global form, and only moderately to local feature change (Kourtzi & Huberle,
2005). This ties in with recent evidence that contour-related effects found in early visual areas like V1 may not be the source of contour integration but a mere epiphenomenon (Chen et al.,
2014). The initial peak of contour-related activation was found in V4, followed by congruent activation in V1, probably evoked by feedback connections from V4.