Abstract
Contextual associations play an important role in human vision and understanding. For example, objects that are contextually congruent with the environment are recognized faster. Do these same contextual associations play a role in artificial vision? If so, we would expect contextual associations between objects (e.g., tent – sleeping bag) to be included in the object representations within a convolutional neural network (CNN) designed for object recognition, even when the network is not explicitly trained to recognize contextual associations. To test this, we examined the similarity in CNN representations between contextually related object pairs (N=73). Stimuli were photographs of objects presented against a white background to ensure that contextual associations were not merely a product of background information. At each layer of the CNN, we compared the similarity of object representations for pairs of contextually related objects with that for pairs of unrelated objects. We found that in almost all layers of the CNN (all except the first), contextually related objects had more similar representations across the units of the CNN than unrelated objects. This held across the 10 different CNNs tested, which varied in number of layers, training data, and recurrent/non-recurrent architecture. Critically, we compared these context representations to human behavior to determine whether the contextual associations represented in a CNN were relevant to human vision. We found that the similarity of object representations due to contextual associations correlated with human judgments of the relatedness of contextually related object pairs. The more similar the object representations in the CNN, the faster and more often humans labeled the objects as contextually related. Most interestingly, despite context being represented across almost all layers of the CNN, correlation with behavior emerged only at the later layers. 
This separation in model-behavior correlation suggests that only high-level or complex regularities relating to context are relevant to human behavior.
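
The layer-wise comparison described above can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: similarity is measured here as the Pearson correlation between unit activations, every object pair not listed as related is treated as unrelated, and the activation values, object names, and function names (`pearson`, `context_effect`) are hypothetical toy placeholders, not data or code from the study.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length activation vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def context_effect(layer_acts, related_pairs):
    """Mean similarity of related pairs minus mean similarity of all other pairs.

    layer_acts maps an object name to its activation vector in one layer.
    A positive value means related objects are represented more similarly.
    """
    related = {frozenset(p) for p in related_pairs}
    names = sorted(layer_acts)
    rel, unrel = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(layer_acts[a], layer_acts[b])
            (rel if frozenset((a, b)) in related else unrel).append(r)
    return mean(rel) - mean(unrel)


# Toy activations for a single layer (hypothetical values, for illustration only).
acts = {
    "tent": [1.0, 2.0, 3.0, 4.0],
    "sleeping_bag": [1.1, 2.1, 2.9, 4.2],
    "toaster": [4.0, 1.0, 3.0, 2.0],
    "bread": [3.9, 1.2, 3.1, 1.8],
}
pairs = [("tent", "sleeping_bag"), ("toaster", "bread")]

effect = context_effect(acts, pairs)
print(f"related minus unrelated similarity: {effect:.3f}")
```

Repeating `context_effect` for each layer's activations would give a per-layer profile of the context effect. The per-pair similarities could likewise be correlated with human relatedness judgments or response times using the same `pearson` helper, again as one possible choice of metric.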