Abstract
Deep neural networks have become the de facto models of human visual processing, but currently lack human-like representations of global shape information. For humans, it has been proposed that global shape representation starts with early mechanisms of contour integration. For example, people are able to integrate over local features and detect extended contours embedded in noisy displays, with high sensitivity for straight lines and systematically decreasing sensitivity as contours become increasingly curvilinear (Field et al., 1993). Here, we tested whether deep neural networks have contour detection mechanisms with these human-like perceptual signatures. Considering a deep convolutional neural network trained to do object recognition (AlexNet), we find that the pre-trained layer-wise feature spaces have little to no capacity to detect extended contours. However, when the network was fine-tuned to detect the presence or absence of a hidden contour, the fine-tuned feature spaces were able to perform contour detection nearly perfectly. Further, using a gradient-based visualization method – guided backpropagation – we find that these fine-tuned classifiers are indeed identifying the full contour, rather than leveraging some unexpected strategy to succeed at the task. Critically, we also found that the scope of fine-tuning was key to achieving human-like contour detection: networks trained only to detect relatively straight contours naturally showed human-like graded accuracy in detecting increasingly curvilinear contours, while networks fine-tuned across the full range of curvature values, or at intermediate curvature levels only, showed distinctly non-human-like signatures, with peaks at the trained curvatures. These results provide a computational argument that human contour detection may actually rely on mechanisms solely designed to amplify relatively linear contours.
Further, these results demonstrate that convolutional neural network architectures are capable of proper contour detection, but do not have the relevant inductive biases to develop these contour-integration mechanisms in service of object classification tasks.