RT Journal Article
A1 Hafri, Alon
A1 Landau, Barbara
A1 Bonner, Michael F
A1 Firestone, Chaz
T1 When a phone in a basket looks like a knife in a cup: Perception and abstraction of visual-spatial relations between objects
JF Journal of Vision
JO Journal of Vision
YR 2019
DO 10.1167/19.10.160a
VO 19
IS 10
SP 160a
OP 160a
SN 1534-7362
AB Our minds effortlessly recognize the objects and environments that make up the scenes around us. Yet scene understanding relies on much richer information, including the relationships between objects—such as which objects may be in, on, above, below, behind, or in front of one another. Such spatial relations are the basis for especially sophisticated inferences about the current and future physical state of a scene (“What will fall if I bump this table?” “What will come with if I grab this cup?”). Are such distinctions made by the visual system itself? Here, we ask whether spatial relations are extracted at a sufficiently abstract level such that particular instances of these relations might be confused for one another. Inspired by the observation that certain spatial distinctions show wide agreement across the world’s languages, we focus on two cross-linguistically “core” categories—Containment (“in”) and Support (“on”). Subjects viewed streams of natural photographs that illustrated relations of either containment (e.g., phone in basket; knife in cup) or support (e.g., spoon on jar; tray on box). They were asked to press one key when a specific target image appeared (e.g., a phone in a basket) and another key for all other images. Although accuracy was quite high, subjects false-alarmed more often for images that matched the target’s spatial-relational category than for those that did not, and they were also slower to reject images from the target’s spatial-relational category. Put differently: When searching for a knife in a cup, the mind is more likely to confuse these objects with a phone in a basket than with a spoon on a jar. We suggest that the visual system automatically encodes a scene’s spatial composition, and it does so in a surprisingly broad way that abstracts over the particular content of any one instance of such relations.
RD 4/20/2021
UL https://doi.org/10.1167/19.10.160a