Abstract
The context in which we view emotional face expressions modulates their interpretation. Of particular importance are the spoken words of a social partner, which provide immediate contextual changes that guide socio-cognitive processing. Despite the salience of auditory cues, prior work has mostly employed visual context primes. These studies demonstrate that language-driven semantic context modulates the perception of facial expressions at both the behavioural and neural levels. While previous auditory context cues have included interjections and prosodic manipulations, auditory situational semantic sentences have not yet been used. The current study investigated how such auditory semantic cues modulate face expression processing. In a within-subjects dynamic design, participants categorized the change in expression of a face (happy; angry) following congruently (e.g., positive sentence-happy face) or incongruently (e.g., positive sentence-angry face) paired auditory sentences presented concurrently with the neutral expression of the face. Neural activity was recorded, and Event-Related Potential (ERP) components time-locked to the onset of the face expression included the P1 (70–120 ms), N170 (120–220 ms), Early Posterior Negativity (EPN; 200–400 ms), and Late Positive Potential (LPP; 400–600 ms). Consistent with previous work, congruent trials elicited faster reaction times and fewer errors relative to incongruent trials. Typical emotion effects were found, whereby the EPN and LPP were enhanced for angry relative to happy faces. Interestingly, the N170 was enhanced for happy relative to angry faces, a finding opposite to typical results in the literature, although reported in previous visual context priming studies. No effects on the P1, and no interaction between sentence valence and face expression, were found. Thus, although auditory semantic cues modulated the categorization of face expressions at the behavioural level, no interaction at the neural level was observed. Findings are discussed within the framework of distinct neural networks for processing auditory semantic cues and visual face expressions in dynamic audio-visual designs.
Acknowledgement: NSERC Discovery Grant