Joseph Burling, Hongjing Lu, Greta Todorova, Frank Pollick; A comparison of eye-movement patterns between experienced observers and novices in detecting harmful intention from surveillance video. Journal of Vision 2016;16(12):1340. doi: 10.1167/16.12.1340.
© ARVO (1962-2015); The Authors (2016-present)
Understanding the intentions of others by viewing their actions in a complex visual scene is a challenging task. Does experience change looking behavior such that gaze to specific action cues creates a fixation "signature" unique to experienced observers? To address this question, we compared the eye movements of experienced surveillance (CCTV) operators and novices as they observed social interactions in CCTV footage. Eleven experienced operators and 10 novices observed 36 unique CCTV clips while an eye tracker recorded their point-of-gaze coordinates. Each clip was 16 seconds in duration and was classified into one of four contexts. Both 'Fight' and 'Confront' clips displayed aggressive behavior; they differed in that, for 'Fight' clips, fighting actually occurred after the end of the clip (not seen by the participants). 'Play' clips showed playful interactions between people with no aggression, while 'Nothing' clips showed typical everyday behavior with no aggression. We used a sequence matching algorithm to compare eye movements (scan paths) between individuals examining the same video. We split sequences into segments (first, middle, and last) to analyze how viewing behavior changes with the accumulation of visual input over time. We found that, on average, experienced operators yielded higher fixation similarity scores than novices, suggesting in-group consistency among experienced observers, with the consistency varying by context and time segment. For both the 'Confront' and 'Fight' contexts, differences in gaze patterns between experienced observers and novices were largest for the middle segment, whereas for the 'Play' context, the largest difference was at the end. The 'Nothing' context yielded higher scores for experts but no differences across time segments.
These results suggest that expert CCTV operators predict actions similarly by attending to the relevant events within a scene, and that for critical events such as aggressive behaviors, important cues (markers that result in signature looking patterns) are attended to well before the onset of the disruptive action.
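The segment-wise scan path comparison described above can be illustrated with a minimal sketch. The abstract does not name the sequence matching algorithm, so this example assumes a plain Levenshtein edit distance as a stand-in, and it assumes fixations have already been coded as region labels (one character per fixation). The function names and the equal-thirds split are hypothetical choices made for illustration only.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences of region labels.

    Assumption: a stand-in for the unspecified sequence matching algorithm.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def split_thirds(seq):
    """Split a scan path into first, middle, and last segments."""
    n = len(seq)
    cuts = [0, n // 3, 2 * n // 3, n]
    return [seq[cuts[k]:cuts[k + 1]] for k in range(3)]


def group_similarity_by_segment(scanpaths):
    """Mean pairwise similarity within a group, one score per segment."""
    if len(scanpaths) < 2:
        raise ValueError("need at least two scan paths to compare")
    segmented = [split_thirds(p) for p in scanpaths]
    scores = []
    for k in range(3):
        pairs = list(combinations([s[k] for s in segmented], 2))
        scores.append(sum(similarity(a, b) for a, b in pairs) / len(pairs))
    return scores


# Hypothetical scan paths for one clip; letters are made-up region labels.
operators = ["AABBCC", "AABBCC", "AABCCC"]
print(group_similarity_by_segment(operators))
```

Under these assumptions, computing the scores separately for operators and for novices on the same clip, and then comparing them per segment, would mirror the within-group consistency comparison reported above.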
Meeting abstract presented at VSS 2016