Abstract
How does the mind extract morally relevant information from visual scenes? We present evidence that humans can make reliable moral judgments based on information presented in the blink of an eye: after viewers see briefly presented static scenes of two people interacting, they can correctly identify (i) who acted on whom (i.e., event role) and (ii) whether the observed event was harmful; moreover, (iii) they can make a reliable moral wrongness judgment about a specific individual in the social interaction. Next, we find that a deep convolutional neural network model trained only to recognize objects can be used to produce moral judgments that are almost indistinguishable from those of humans, with only minimal additional training: a linear transform from its high-level features to the role, harm, and moral labels. We also find that earlier representational layers of this network can accurately predict role, but only later layers can predict harm and moral wrongness. Furthermore, this model shows patterns similar to human behavior (i) at the individual-scene level, and (ii) when confronted with a separate set of experimentally manipulated images for which it is more difficult for humans to identify who acted on whom. Based on these results, we argue that in the process of learning to recognize objects, the visual system also learns high-level visual features that can be used to make reliable moral judgments about observed events.
Meeting abstract presented at VSS 2018
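As a hedged illustration of the linear-readout approach described above (not the authors' actual pipeline), the recipe is: freeze a network pretrained for object recognition, extract activations from a chosen layer as "high-level features," and fit a linear classifier mapping those features to role, harm, or wrongness labels. The specific backbone (ResNet-18), layer cutoff, and scikit-learn classifier below are illustrative assumptions.

```python
# Illustrative sketch only: frozen object-recognition CNN + linear readout.
# The backbone, layer choice, and classifier are assumptions, not the
# authors' exact method.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.linear_model import LogisticRegression

# Load a CNN pretrained for object recognition and freeze it.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.eval()

# Use the penultimate (globally pooled) activations as high-level features.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(images):
    """images: list of PIL images -> (N, 512) feature matrix."""
    batch = torch.stack([preprocess(img) for img in images])
    with torch.no_grad():
        feats = feature_extractor(batch).flatten(1)
    return feats.numpy()

# "Minimal additional training": a linear map from frozen features to labels,
# shown here for a binary harm label; the same recipe applies to role and
# moral-wrongness labels.
def fit_linear_readout(train_images, harm_labels):
    X = extract_features(train_images)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, harm_labels)
    return clf
```

Comparing earlier versus later representational layers, as the abstract describes, would follow the same recipe with the feature extractor truncated at different depths of the network.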