Abstract
Eye movements provide insight into what parts of an image a viewer finds most salient, interesting, or relevant to the task at hand. Unfortunately, eye-tracking data, a commonly used proxy for attention, is difficult to collect at scale. Here, we present TurkEyes, a toolbox of crowdsourceable user interfaces to collect attention data without using an eye tracker. The four interfaces in our toolbox represent different interaction methodologies found in the literature for capturing attention. ZoomMaps (introduced here) is a "zoom-based" interface that tracks the viewport on a user's mobile phone as they pan and zoom. CodeCharts (inspired by Rudoy et al., 2012) is a "self-report" technique in which participants report where they gazed using a grid of codes that appears after image presentation. ImportAnnots (O'Donovan et al., 2014) is an "annotation" tool for selecting important image regions, and BubbleView (Kim et al., 2017) is a "cursor-based" moving-window approach that lets viewers click to reveal a small area of an otherwise blurred image. We place these interfaces within a common code and analysis framework to compare their output and develop guidelines for how to use them. We design experiments and validation procedures to capture high-quality data and explain how to convert the output of each method into an attention heatmap. Using Amazon's Mechanical Turk, we collect attention heatmaps on a variety of image types. Although all the interfaces capture some common aspects of attention, we find that they are best suited for different image types and tasks. For example, ZoomMaps is ideal for large, multi-scale visualizations; CodeCharts captures eye movements over time; ImportAnnots works well for graphic designs; and BubbleView is cheap but distorts the stimuli. This toolbox and our analyses open up exciting opportunities for gathering attention data at scale, without an eye tracker, across a diversity of stimuli and task types.