Abstract
Deepfake videos, in which faces are modified using deep learning, are becoming easier to create and more convincing, posing online security risks. Machine learning models are being developed to detect deepfakes, but human users often disregard model predictions (Groh et al., 2021). Here, we explore a novel visual indicator called Deepfake Caricatures, in which video artifacts are magnified, and show that it is both effective and subjectively convincing. In Experiment 1, participants viewed silent videos of people and indicated whether each was real or fake. Deepfakes were either non-signaled or signaled with the Caricature technique (between-subjects, N = 180). Detectability was measured across conditions known to affect visual search and signal detection performance: a Baseline condition (E1a: 50% prevalence, no time limit), Low Prevalence (E1b: 20% prevalence), Time Pressure (E1c: 2-second time limit), and Noisy Visual Input (E1d: video compression noise). Across all conditions, sensitivity was much higher for Caricatures than for non-signaled deepfakes (E1a: t = 30.08, p < 0.001; E1b: t = 32.28, p < 0.001; E1c: t = 39.62, p < 0.001; E1d: t = 30.58, p < 0.001; unpaired t-tests), showing that the Caricature manipulation is effective. Additionally, we compared the detectability of non-signaled deepfakes across experiments and found that all conditions significantly reduced deepfake detection relative to Baseline, suggesting that the current literature overestimates deepfake detectability in the wild. In Experiment 2, participants (N = 298) viewed videos and rated their confidence that each was real or fake using a slider interface, then viewed a model's prediction displayed either as text or in Caricature form, and finally were allowed to update their responses. Participants updated their responses more often following a Caricature than a text-based indicator (t = 4.30, p < 0.001), showing that Caricatures are a more convincing visual indicator than text.