Abstract
Beyond recognizing objects, faces, and scenes, we can also recognize the actions of other people. Accordingly, a large literature explores how we make inferences about behaviors such as walking, reaching, pushing, lifting, and chasing. However, in addition to actions with physical goals (i.e., trying to *do* something), we also perform actions with epistemic goals (i.e., trying to *learn* something). For example, someone might press on a door to figure out whether it is locked, or shake a box to determine its contents (e.g., a child wondering if a wrapped-up present contains Lego blocks or a teddy bear). Such ‘epistemic actions’ raise an intriguing question: Can observers tell, just by looking, what another person is trying to learn? And if so, how fine-grained is this ability? We filmed volunteers playing two rounds of a ‘physics game’ in which they shook an opaque box to determine either (a) the number of objects hidden inside, or (b) the shape of the objects hidden inside. Then, an independent group of participants watched these videos (without audio) and were instructed to identify which videos showed someone shaking for number and which videos showed someone shaking for shape. Across multiple task variations and hundreds of observers, participants succeeded at this discrimination, accurately determining which actors were trying to learn what, purely by observing the box-shaking dynamics. This result held both for easy discriminations (e.g., 5-vs-15) and hard discriminations (e.g., 2-vs-3), and both for actors who correctly guessed the contents of the box and actors who failed to do so — isolating the role of epistemic *intent* per se. We conclude that observers can visually recognize not only what someone wants to do, but also what someone wants to know, introducing a new dimension to research on visual action understanding.