Open Access
Article  |   June 2023
Head and body cues guide eye movements and facilitate target search in real-world videos
Author Affiliations
  • Nicole X. Han
    Department of Psychological and Brain Sciences, Institute for Collaborative Biotechnologies, University of California, Santa Barbara, CA, USA
    xhan01@ucsb.edu
  • Miguel P. Eckstein
    Department of Psychological and Brain Sciences, Institute for Collaborative Biotechnologies, University of California, Santa Barbara, CA, USA
    migueleckstein@ucsb.edu
Journal of Vision June 2023, Vol.23, 5. doi:https://doi.org/10.1167/jov.23.6.5
Abstract

Static gaze cues presented in central vision result in observer shifts of covert attention and eye movements, and benefits in perceptual performance in the detection of simple targets. Less is known about how dynamic gazer behaviors with head and body motion influence search eye movements and performance in perceptual tasks in real-world scenes. Participants searched for a target person (yes/no task, 50% presence) while watching videos of one to three gazers looking at a designated person (50% valid gaze cue, looking at the target). To assess the contributions of different body parts, we digitally erased parts of the gazers in the videos to create three body parts/whole conditions: floating heads (only head movements), headless bodies (only lower body movements), and a baseline condition with intact head and body. We show that valid dynamic gaze cues guided participants’ eye movements (up to three fixations) closer to the target, reduced the time to foveate the target, reduced fixations to the gazers, and improved target detection. The effect of gaze cues in guiding eye movements to the search target was smallest when the gazer's head was removed from the videos. To assess the inherent information about gaze goal location in each body parts/whole condition, we collected perceptual judgments estimating gaze goals from a separate group of observers with unlimited viewing time. These perceptual judgments showed larger estimation errors when the gazer's head was removed, suggesting that the reduced eye movement guidance from lower body cueing is related to observers’ difficulty extracting gaze information without the presence of the head. Together, the study extends previous work by evaluating the impact of dynamic gazer behaviors on search with videos of real-world cluttered scenes.

Introduction
Gaze information orients overt attention
The gaze direction, head orientation, and body posture of people around us reveal essential information about their internal mental states, intentions, and potential future actions (Azarian, Buzzell, Esser, Dornstauder, & Peterson, 2017; Bayliss, di Pellegrino, & Tipper, 2004a; Emery, 2000; Kleinke, 1986). Daily social interactions involve inferring others’ attention to plan one's future actions. When trying to infer another individual's locus of attention, humans automatically follow that person's gaze direction with their own eye movements. This gaze-following behavior is present in infants as early as 10 months old (Brooks & Meltzoff, 2005). Similar behaviors are also widely observed in nonhuman animals, such as apes, monkeys, dogs, and goats (Bräuer, Call, & Tomasello, 2005; Brooks & Meltzoff, 2005; Kaminski, Riedel, Call, & Tomasello, 2005; Senju & Csibra, 2008; Shepherd, 2010; Wallis et al., 2015). 
It is difficult for humans to ignore others’ eye and head gaze shifts. Studies have shown that centrally presented gaze, head, and body posture cues induce attention shifts even when the gaze direction is nonpredictive of the target location (Bayliss et al., 2004a; Driver et al., 1999; Friesen & Kingstone, 1998; Friesen, Ristic, & Kingstone, 2004; Hietanen, 1999; Kingstone, Friesen, & Gazzaniga, 2000; McKee, Christie, & Klein, 2007; Ristic, Wright, & Kingstone, 2007). Because of this robust effect on shifting attention, gaze shifts are commonly described as exogenous cues. However, their temporal development is different. Exogenous attention is involuntary, transient, and usually triggered by sudden changes in the environment; its effect typically peaks within 100 ms and quickly dissipates by around 150 to 200 ms (Cheal & Lyon, 1991; Maylor & Hockey, 1985; Mulckhuyse & Theeuwes, 2010; Müller & Findlay, 1988; Müller & Rabbitt, 1989; Nakayama & Mackeben, 1989; Posner & Cohen, 1984; Shepherd & Müller, 1989; Theeuwes, 1991). The time course of covert attention for gaze cues differs: compared to a typical peripheral exogenous cue (e.g. a flash in the visual periphery), the effect of a gaze cue appears as early as 100 ms, persists to 300 to 500 ms from cue onset, and decays afterward (Bayliss et al., 2004a; Friesen & Kingstone, 1998; McKee et al., 2007; Müller & Findlay, 1988; Posner & Cohen, 1984; Shepherd & Müller, 1989; Theeuwes, 1991). 
Gaze as a dynamic behavior
The majority of studies use static images of the eyes, face, or body postures to study the cueing effects on both covert attention (Bayliss, di Pellegrino, & Tipper, 2004b; Driver et al., 1999; Friesen & Kingstone, 1998) and overt attention (Friesen & Kingstone, 2003; Hood, Willen, & Driver, 1998; Mansfield, Farroni, & Johnson, 2003; Ricciardelli, Bricolo, Aglioti, & Chelazzi, 2003), and perceptual performance with simple tasks such as dot detection or letter identification. However, gaze-following is fundamentally a dynamic behavior. Some studies have used simplified dynamics of gaze behaviors, including moving point lights or a single animation of someone's face (Hermens & Walker, 2012; Kuhn & Tipples, 2011; Rutherford & Krysko, 2008; Shi, Weng, He, & Jiang, 2010; Sun, Stein, Liu, Ding, & Nie, 2017; Wang, Yang, Shi, & Jiang, 2014). Recently, a few studies have also started to use realistic videos to study how gaze cues affect attention in natural scenarios (Gregory, 2021; Han & Eckstein, 2022). 
Gaze cueing and visual search
The role of gaze following in more complex tasks, such as visual search in cluttered scenes, has not been studied. Observers’ performance during visual search is degraded by spatial uncertainty about the target (Bochud, Abbey, & Eckstein, 2004; Burgess & Ghandeharian, 1984; Shimozaki, Schoonveld, & Eckstein, 2012; Swensson & Judy, 1981), distractors that can be confused with the target (Nagy, Neriani, & Young, 2005; Palmer, Verghese, & Pavel, 2000), clutter (Henderson, Chanceaux, & Smith, 2009; Rosenholtz, 2016), visual processing in the visual periphery (Deza & Eckstein, 2016; Lago, Sechopoulos, Bochud, & Eckstein, 2020; Michel & Geisler, 2011; Rosenholtz, 2016; Semizer & Michel, 2017; Strasburger, Rentschler, & Jüttner, 2011), search inefficiencies (Lago et al., 2021; Morvan & Maloney, 2012; Verghese, 2012), and decision suboptimalities (Mitroff & Biggs, 2014; Wolfe, Horowitz, & Kenner, 2005). To mitigate search errors, humans use target features, cues, and context that are predictive of target locations to orient covert attention and guide the foveal region of the eye toward task-relevant locations (Bravo & Farid, 2009; Castelhano & Heaven, 2010; Eckstein, 2017; Eckstein, Beutter, & Stone, 2001; Eckstein, Drescher, & Shimozaki, 2006; Findlay, 1997; Koehler & Eckstein, 2017a; Malcolm & Henderson, 2009; Neider & Zelinsky, 2006; Torralba, Oliva, Castelhano, & Henderson, 2006; Võ, Boettcher, & Draschkow, 2019; Võ, 2021; Wolfe, Võ, Evans, & Greene, 2011). 
Gaze shifts of individuals in a scene (i.e. gazers) can play an important role in guiding and facilitating visual search. In these scenarios, the gazer does not typically appear in an observer's central vision; more often, the gazer appears in the observer's visual periphery. Inferring the gaze direction of others in these situations requires taking into account not only the gazer's eye region, which might not be visible in the periphery (Loomis, Kelly, Pusch, Bailenson, & Beall, 2008), but also the head orientation, body postures, and the dynamics of the head and body movements. Studies have found that the combination of eyes, head, and body influences attentional shifts. For example, studies have found a stronger effect in both orienting overt (Azarian et al., 2017) and covert attention (Bayliss et al., 2004a; Hietanen, 1999; Hietanen, 2002) when either the head orientation and eye gaze direction are incompatible (e.g. the head rotates to the right but the eyes look toward the front/left) or when the head and body orientations are incompatible (see review Frischen, Bayliss, & Tipper, 2007). Covert attention at the gazed-at location is also more temporally sustained when the head and body are present (Han & Eckstein, 2022). 
To our knowledge, no study has investigated the effect of dynamic gaze cues on visual search involving multiple eye movements, varying gaze cue eccentricities, and real-world scenes. Here, we evaluate the contributions of a gazer's head and body cues in guiding eye movements and facilitating the search for a target person with videos of real scenes. If the gazer is looking to the left side of the image while the gaze goal (the designated gazed person) is on the opposite side, we would expect gaze-following eye movements to impact visual search. The results could shed light on the connection between available gaze information in the video, active eye movement planning, and behavioral performance in visual search tasks. 
In addition, we used digital video editing techniques to erase the head or lower body of the gazers and replace it with the immediate background, creating three experimental conditions (gazer intact, floating heads, and headless bodies; Figure 1). This manipulation allowed us to isolate the separate effects of gazers’ head and lower body movements on eye movements during visual search. At the beginning of the videos, only the gazers’ behaviors were visible to the observers, ensuring that eye movement planning depended only on the gaze information. After the gaze behavior stopped and observers were allowed to make eye movements to search for the target, we showed multiple distractor people along with the target (target-present trials only). If only one person (either a distractor or the target) appeared suddenly after the gaze behaviors were completed, the sudden onset could interrupt natural eye movements and attract the observer's attention regardless of whether the observer was following the gaze direction. Therefore, we presented multiple distractors rather than a single individual for the search task. 
To further explore how eye movement planning during gaze-following was related to the gaze information available in the videos (e.g. the direction of the head and lower body), we collected a separate dataset in which people made explicit judgments about the location at which the gazers looked (the gaze goal). We considered these judgments as estimates of the upper limits of the information available to the saccade system to guide eye movements during search. By comparing the perceptual judgments of gaze goals to the eye movements during search, we could assess whether the eye movement errors with different head/body cues are related to the inherent perceptual information about the gaze goal available in the gazers’ heads and bodies. 
Materials and methods
Search task
Subjects
Twenty undergraduate students (ages 18–21 years, 12 women and 8 men) from the University of California, Santa Barbara, CA, were recruited as subjects for course credit. All had normal or corrected-to-normal vision. All participants signed consent forms to participate in the study. Individuals filmed in the videos signed consent forms authorizing the presentation of their images in scientific publications and presentations of the study. The study was approved by the Institutional Review Boards at the University of California, Santa Barbara, CA. 
Stimuli and instruments
The stimuli consisted of 60 clips from in-house videos (approximately 3 seconds in length) that were originally filmed at the University of California, Santa Barbara, CA, campus. Videos included both indoor and outdoor scenes, such as classrooms, outside campus buildings, dining halls, etc. In each video clip, there were multiple students instructed during filming to look toward a designated person at the same time. We refer to the individuals in the videos shifting their gaze as “gazers.” The designated gazed person (“gaze goal”) could be the target person that observers were looking for (50% of trials), or could be a distractor person (the remaining 50% of trials). The target person was the same individual across all videos and subjects (see Figure 1). The videos were filmed on different days. Thus, the target person could appear with different clothing across the videos. 
For each video clip, we first extracted individual frames. Then, we manually segmented each individual's head and body outlines. We randomly selected some gazers to be digitally deleted during the initial presentation of the video (gaze cueing process), along with the gaze goal person (the person gazed at by all the gazers). To do this, we picked out the frames for the gaze-orienting process and replaced the red, green, and blue (RGB) values of pixels contained by the outline of the individuals to be deleted with the RGB values of those pixels in the immediate background (available from other frames without the individuals). This method allowed us to delete some gazers and the target/distractor individually from the initial portion of the video frames prior to the end of the gazers’ head movement. By changing the selected gazers, we were able to create multiple versions of a video. For each video, the gaze goal person and the distractors were digitally deleted from the video frames, and appeared either 200 ms or 500 ms (stimulus onset asynchrony [SOA]) after the completion of the gazers’ head movements. Finally, processed frames were compiled to create videos in which only one to three individuals orient their gaze, heads, and bodies toward a point in the scene, followed by the appearance of two to four individuals (target and/or distractor individuals) after a 200 or 500 ms delay (see Figure 1). See Figure A1 for more video frame examples for invalid cue or target-absent trials. Out of all the videos, 38% had the same number of distractors on each side of the image (left versus right of the central fixation), 28% had the majority of distractors on the left side, and 33% had the majority of distractors on the right side. In addition, for those videos that had an imbalanced number of distractors on the two sides, there was no relationship between the gazed target person's location (image side) and the majority of distractors (χ2 = 0.75, p = 0.39). 
Therefore, there was no bias in the distractor locations that participants could use as a cue to guide eye movements toward the target. 
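The pixel-replacement deletion described above can be sketched as a simple mask-based operation (a minimal illustration of the idea, not the authors' actual pipeline; the function and variable names are ours):

```python
import numpy as np

def erase_person(frame: np.ndarray, background: np.ndarray,
                 mask: np.ndarray) -> np.ndarray:
    """Replace the pixels inside a person's segmented outline with the
    corresponding pixels from a person-free background frame.

    frame, background: H x W x 3 uint8 RGB images
    mask: H x W boolean array, True inside the person's outline
    """
    out = frame.copy()
    out[mask] = background[mask]  # boolean indexing broadcasts over the RGB axis
    return out
```

Applying `erase_person` frame by frame, with a different mask per gazer, yields the multiple versions of each video (floating heads, headless bodies, deleted gaze goal person) used in the experiment.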
In addition to the condition where gazers’ heads and bodies were present (intact condition), we also created another two conditions referred to as floating heads and headless bodies, where gazers’ bodies or heads were digitally deleted during the gaze-orienting process, respectively, and replaced by the background (see Figures 1b, c). Different conditions manipulated the head/body features present in the videos for the gazers (but not the target/distractor individuals). In summary, we created videos for three conditions: (1) intact videos, (2) floating head videos (gazers’ bodies were invisible), and (3) headless body videos (gazers’ heads were invisible). In all videos, we retained the immediate background behind the erased heads or bodies (see Figure 1). See Figure A1 for more examples. 
All videos were presented at the center of the computer screen, subtending a visual angle of 18.4 × 13.8 degrees (width × height). Participants’ eyes were 75 cm from a Barco MDRC 1119 monitor (1280 × 1024 pixels). Each participant's left eye was tracked with a video-based eye tracker (SR Research EyeLink 1000 Plus Desktop Mount) at a sampling rate of 1000 Hz. Eye movements were calibrated and validated before the experiment. Events in which velocity exceeded 35 degrees/second and acceleration exceeded 9500 degrees/second² were recorded as saccades. 
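The saccade criterion can be sketched as a velocity/acceleration threshold detector on the gaze traces (a simplified stand-in for the EyeLink parser, whose actual algorithm differs in detail; function and variable names are ours):

```python
import numpy as np

def saccade_samples(x_deg, y_deg, fs=1000, vel_thresh=35.0, acc_thresh=9500.0):
    """Flag gaze samples whose speed and acceleration exceed the thresholds.

    x_deg, y_deg: gaze position traces in degrees of visual angle
    fs: sampling rate in Hz (1000 Hz for the EyeLink 1000 Plus)
    Returns a boolean array marking putative saccade samples.
    """
    vx = np.gradient(np.asarray(x_deg, dtype=float)) * fs
    vy = np.gradient(np.asarray(y_deg, dtype=float)) * fs
    speed = np.hypot(vx, vy)                 # degrees/second
    accel = np.abs(np.gradient(speed)) * fs  # degrees/second^2
    return (speed > vel_thresh) & (accel > acc_thresh)
```

Contiguous runs of flagged samples would then be grouped into saccade events, with the first run after stimulus onset giving the first saccade latency.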
Procedure
Subjects were asked to judge whether a target person was present or absent in the videos. The target was a specific person present in 50% of the videos. Observers were first given unlimited time to familiarize themselves with pictures of the target person in different clothing outfits (see Figure 2a). Then, they completed a practice session with 10 videos in order to make sure they were able to identify the target person. There was only one target person across all trials. These practice videos were different from the videos for the main experiment. 
Participants then proceeded to complete the main experiment with three conditions (1 = intact; 2 = floating heads; and 3 = headless bodies) in a random blocked order within one sitting. Videos were presented randomly within each block (condition). Each session included a complete set of all three conditions. Each subject finished two sessions, resulting in 360 trials (60 trials/condition × 3 conditions/session × 2 sessions) total. Participants were required to complete the eye tracker nine-point calibration and validation before the experiment started. Before the initiation of a trial, the eye tracker was recalibrated and revalidated if large eye drifts were detected that caused a failure to maintain fixation (>1.5 degrees of visual angle). 
On each trial, the participants were instructed to fixate on the central cross while pressing the space bar to start the video. Once the video started, the central cross stayed on the screen, and participants were instructed to fixate on the cross without eye movements while gazers shifted their gaze. If an eye-position deflection greater than 1.5 degrees of visual angle from the fixation cross was detected during the gaze shift, the trial was aborted. At the moment when all the gazers looked at the designated person and stopped moving, the central cross disappeared, and observers were free to make eye movements. Either 200 ms or 500 ms after the central cross disappeared, the target with distractors (target-present trials) or all distractors (target-absent trials) were digitally re-inserted into the videos for 1000 ms before the response screen (see Figure 2b). 
Finally, participants made the response by pressing a key to indicate whether the target person was present or absent (see Figure 2b). Pictures of the target person were presented for reference when they made a response after each video. 
Explicit gaze estimates task
Subjects
One hundred subjects (aged over 18 years) were recruited through a human intelligence task (HIT) posted on Amazon Mechanical Turk (Mturk). The study was approved by the Institutional Review Boards at the University of California, Santa Barbara, CA. All subjects consented to participate in the experiment. 
Stimuli
Stimuli consisted of individual frames from the same videos presented in the eye-tracking search experiment. Specifically, for each of the 60 videos in each condition (intact, floating heads, and headless bodies), the frame in which all gazers directly looked at the same person was extracted. However, the gazed-at person was deleted from the frame and was replaced by the background pixel values to produce images with minimal visible manipulations to the observers. There were 180 different images (60 videos × 3 conditions) in total. 
Procedures
Subjects were asked to make an explicit perceptual estimation of the gazers’ goal by selecting the location on the image where they thought all the gazers were looking. Subjects were informed that the gaze goal target had been removed from each image. Their task was to make the best judgment about where the gazers were looking within the scene. Sixty images from 60 videos were presented in random order for each subject. Subjects had unlimited viewing time. Motor errors in the spatial selection of the gaze goal location could be corrected before proceeding to the next image. For each image, the condition (intact, floating heads, and headless bodies) was selected randomly on each trial. Importantly, each subject could only see an image in a single condition to prevent interference from memory across multiple viewings of an image. 
Data analysis
Eye movements of search task
We used a within-subject 3 × 2 × 2 ANOVA to measure the effects of condition (intact, floating heads, and headless bodies), cue validity (valid and invalid), and SOA (200 ms and 500 ms) on the first saccade latency and the fixation distance to the target person. We used bootstrap tests to evaluate whether participants more often made fixations on the same side as the gaze goal (following the gaze cue), and to assess the proportion of trials in which they foveated the gaze goal/gazers (within 2 degrees of visual angle). All p values for Tukey post hoc tests were corrected using the false discovery rate (FDR). 
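The bootstrap comparison of a proportion against chance might look like the following sketch (our own minimal version; the authors' exact resampling scheme is not specified, and the names are ours):

```python
import random

def bootstrap_p_above_chance(outcomes, null=0.5, n_boot=10000, seed=0):
    """One-sided bootstrap p-value for a proportion exceeding `null`.

    outcomes: list of 0/1 trial outcomes (e.g. 1 = first fixation landed
    on the same side as the gaze goal).
    Resamples trials with replacement; the p-value is the fraction of
    bootstrap proportions that fall at or below the null proportion.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    at_or_below = 0
    for _ in range(n_boot):
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        if sum(resample) / n <= null:
            at_or_below += 1
    return at_or_below / n_boot
```

With 10,000 bootstrap samples (as in the paper), a proportion well above 0.5 yields a p-value near zero, while a proportion at chance does not.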
Comparison of eye movements and explicit gaze goal estimates
In order to measure the relationship between the spatial distribution of eye movements and explicit perceptual judgments about gaze goals, we created fixation maps from the eye movements from the search task and perceptual judgment maps from spatial selections of the explicit gaze estimates study. For each video in the eye-tracking experiment, all subjects’ fixations were collected to create a fixation map. Similarly, for each image from the Mturk experiment, all subjects’ spatial selections were collected to create a perceptual judgment map. The fixation and gaze goal estimate maps were all smoothed with a Gaussian filter (standard deviation of 20 pixels) and were then normalized to sum to one. We measured the similarity of the fixation and gaze estimate maps for a video by taking the normalized dot product of the maps. We used permutation tests (10,000 permutations) across videos for the pairing of fixation and perceptual judgment maps to obtain a distribution of dot products that one might expect by chance. To assess differences in eye movement distributions across different conditions (intact, floating heads, and headless bodies), we also ran 1-way ANOVA tests on the dot product between intact gaze goal estimate maps and the fixation maps from all three conditions. 
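The map construction and comparison above can be sketched as follows (a minimal reading of the procedure: the Gaussian smoothing is a numpy-only stand-in, and we interpret the "normalized dot product" as cosine similarity of the maps; all names are ours):

```python
import numpy as np

def _gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 standard deviations."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    return k / k.sum()

def density_map(points, shape, sigma=20):
    """Accumulate (x, y) selections into a grid, smooth with a Gaussian
    (SD = 20 pixels, as in the paper), and normalize to sum to one."""
    m = np.zeros(shape, dtype=float)
    for x, y in points:
        m[int(y), int(x)] += 1.0
    k = _gaussian_kernel(sigma)
    # separable Gaussian blur: filter along rows, then along columns
    m = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, m)
    m = np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, m)
    return m / m.sum()

def normalized_dot(a, b):
    """Cosine similarity between two flattened maps."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A permutation null for the matched-pair dot products can then be obtained by repeatedly shuffling which video's fixation map is paired with which judgment map and recomputing the similarity.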
Target detection performance for search task
We examined the effect of gaze cueing on search detection performance. We conducted a three-way within-subject ANOVA to test the effects of condition (intact, floating heads, and headless bodies), cue validity (valid and invalid), and SOA (200 ms and 500 ms) on the hit rate (proportion of correctly detected target-present trials). We conducted a within-subject 2-way ANOVA (condition × SOA) on false positive rates. We corrected the p values using FDR for the Tukey post hoc tests to reduce the probability of making a type I error. Multiple one-sample t-tests were implemented to test the significance of ∆d’ (valid d’ – invalid d’) with FDR correction. Then a 1-way ANOVA on sensitivity difference ∆d’ was conducted to test differences among the three conditions. 
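The sensitivity measure can be sketched with the standard equal-variance signal detection formula, d' = z(hit rate) − z(false-alarm rate); the paper does not spell out its exact d' computation, so this conventional form and the handling of the shared false-alarm rate are our assumptions:

```python
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Equal-variance signal detection sensitivity:
    d' = z(H) - z(FA), with z the inverse standard-normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def delta_dprime(hit_valid, hit_invalid, fa_rate):
    """Cueing effect on sensitivity: valid d' minus invalid d'.
    False alarms occur on target-absent trials and cannot be split by
    cue validity, so (by assumption) the same FA rate is used for both."""
    return dprime(hit_valid, fa_rate) - dprime(hit_invalid, fa_rate)
```

For example, a 0.84 hit rate against a 0.5 false-alarm rate gives a d' of roughly 1.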
Relationship between eye movements and performance for search task
Finally, we computed a correlation between the cueing effect on behavioral performance (the mean difference, valid − invalid, in sensitivity d’) and the cueing effect on eye movements (the mean difference, invalid − valid, in the distance of the closest fixation to the target). 
Results
First saccade latency is affected by SOA delay
In most trials, observers made two to three saccades within 1000 ms when they searched for the target in the videos (Figures 3a, b). A three-way (condition × SOA × cue validity) within-subject ANOVA showed a main effect of SOA (200 vs. 500 ms) on the first saccades’ latency, F(1, 19) = 27.56, p = 4.56e-05 (Figure 3c). No effect of cue validity (F(2, 38) = 2.0, p = 0.15) or condition (F(2, 38) = 0.26, p = 0.78) was found on the first saccade latency. The FDR-corrected post hoc tests showed significantly longer saccade latencies in trials with the longer SOA delay than with the shorter SOA delay in all three conditions: in the intact condition, 500 ms SOA, M = 240.1 ms, SE = 10.6 ms versus 200 ms SOA, M = 198.0 ms, SE = 10.11 ms, p = 0.047; in the floating heads condition, 500 ms SOA, M = 249.7 ms, SE = 12.1 ms versus 200 ms SOA, M = 191.8 ms, SE = 9.4 ms, p = 0.002; and in the headless bodies condition, 500 ms SOA, M = 242.5 ms, SE = 10.6 ms versus 200 ms SOA, M = 185.0 ms, SE = 10.9 ms, p = 0.0021. Thus, when the target/distractor people appeared after the longer SOA delay, subjects took longer to execute their first saccade after processing the gaze cue while fixating the central cross. 
Fixations are guided by gaze cues
We first analyzed the effect of the gaze cue in orienting the first fixation. Figure 4 shows examples of heatmaps of first fixation positions for the three conditions for valid and invalid gaze cue trials. A three-way (condition × SOA × cue validity) within-subject ANOVA showed a significant main effect of cue validity on the distance of the first fixations to the target person, F(1, 19) = 85.39, p < 0.001, and a significant interaction between condition and cue validity, F(2, 38) = 15.59, p < 0.001. The FDR-corrected post hoc Tukey tests showed significantly shorter distances between the first fixation and the target person when the cue was valid than when it was invalid in all three conditions (Figure 5a; see Table 1 for details). With valid gaze cues, the distance to the target was significantly larger in the headless bodies condition than in the intact condition, p = 0.014, but no other differences were found (intact versus floating heads, p = 0.80; floating heads versus headless bodies, p = 0.29). The length of the SOA did not affect the location of the first fixations (F(1, 19) = 0.68, p = 0.42). 
Table 1.
 
All fixation distance to the target summary (with standard errors across subjects in parentheses and p values below the standard errors). All p values for Tukey post hoc tests were corrected using false discovery rate (FDR). BOLD p values were significant.
Similarly, we found a significant main effect of cue validity on the second fixation's distance to the target person, F(1, 19) = 59.96, p = 2.71e-07, and a significant interaction between cue validity and body part condition, F(2, 38) = 6.46, p = 0.004 (Figure 5b). Post hoc tests showed a larger distance to the target when the cue was invalid in all three conditions (see Table 1 for details). In addition, given a valid cue, the distance to the target was larger in the headless bodies condition than in both the intact condition, p = 0.028, and the floating heads condition, p = 0.028, with no difference between the intact and floating heads conditions, p = 1.00. The cueing effect on eye movement guidance toward the target persisted for the third fixations (see Table 1 for details). For the fourth fixations, there were fewer trials, and we did not observe any significant effect (see Figure 5c for the first to the fourth fixations across all conditions). 
To further quantify the influence of the gaze cue in guiding eye movements among the three conditions, we calculated the difference between the fixations’ distance to the target for valid versus invalid cues (∆distance = distance invalid – distance valid; Figure 5d). A one-way ANOVA on ∆distance for the first fixation showed a main effect of condition, F(2, 38) = 15.59, p = 1.14e-05. FDR-corrected post hoc tests showed a significantly higher ∆distance in the intact condition compared to that of the floating heads condition, p = 0.024, as well as compared to that of the headless bodies condition, p = 1.42e-06 (see detailed results in Table 2). In addition, ∆distance in the floating heads condition was significantly higher than that of the headless bodies condition, p = 0.024. A similar result was found for the second fixation, F(2, 38) = 6.46, p = 0.0038, but no main effect of condition was found for the third fixation, F(2, 55) = 0.72, p = 0.49, or the fourth fixation, F(2, 37) = 0.69, p = 0.51 (see Table 2, Figure 5d). 
Table 2.
 
Mean ∆distance (invalid-valid) to the target for all the conditions and fixations, with standard errors across participants, and p values for comparisons against 0 (corrected by FDR). BOLD p values were significant.
To further understand whether the first eye movement was indeed directed toward the gaze goal, we calculated the proportion of trials in which the fixation was located on the same side (left/right) as the gaze goal (note that the starting point was always the fixation cross at the center of the image; Figure 5e). We found that the proportion of first fixations directed to the side of the image with the gaze goal was significantly greater than 0.5 (bootstrap test with 10,000 samples; detailed results in Table 3), indicating that the participants were indeed following the gaze direction most of the time. The one exception was the invalid cue trials in the headless bodies condition, which were not significantly different from 0.5 (p = 0.36). 
Table 3.
 
The proportion of trials where 1st fixation was located at the same side (left/right) as the gaze goal, and bootstrap p values for comparisons against 50% (corrected by FDR). BOLD p values were significant.
A two-way (body parts/whole condition × cue validity) within-subject ANOVA on the percentage of trials in which subjects foveated the target person's head (within 2 degrees of visual angle) showed significant main effects of body parts/whole condition, F(2, 38) = 11.09, p = 1.61e-04, and cue validity, F(1, 19) = 38.55, p = 5.78e-06. There was also a significant interaction between the body parts/whole condition and cue validity, F(2, 38) = 6.24, p = 0.005. FDR-corrected post hoc tests showed a significantly higher proportion of trials with fixations foveating the target person in valid versus invalid trials in the intact condition (valid = 37.77%, SE = 3.73%; invalid = 14.35%, SE = 1.88%; p = 7.83e-06), the floating heads condition (valid = 34.94%, SE = 3.97%; invalid = 15.31%, SE = 2.42%; p = 3.98e-04), and the headless bodies condition (valid = 23.45%, SE = 2.75%; invalid = 13.23%, SE = 1.86%; p = 2.67e-05; Figure 6a). Similar results were found when we adjusted our criterion of foveation from within 2 degrees to within 3 degrees of the target (see Figure A2). For the subset of trials in which participants foveated the target person, we also found a significant main effect of cue validity on the time it took to foveate the target, F(1, 110) = 4.2, p = 0.043 (Figure 6b). The post hoc test showed a trend toward longer times to foveate the target with invalid gaze cues in the intact condition, but it did not reach significance (invalid 0.55 seconds versus valid 0.40 seconds, p = 0.08; Figure 6b). No difference was found in the other conditions (floating heads, invalid = 0.45 seconds, valid = 0.40 seconds, p = 0.14; headless bodies, invalid = 0.51 seconds, valid = 0.46 seconds, p = 0.65). 
Relationship between eye movement guidance and explicit perceptual estimates of gaze goals
We first quantified the perceptual estimation errors (data from the explicit gaze estimates task) for the intact, floating heads, and headless bodies conditions using the MTurk-collected perceptual judgment maps of images with the gaze goals digitally deleted. This allowed us to assess the inherent information about the gaze goal that observers can perceptually extract from the heads, the lower bodies, and their joint presence (intact). The subjects making these perceptual estimates were different from the observers participating in the search task. 
As an error metric for the perceptual judgments, we calculated the distance of the peak of the density map of the perceptual estimates to the known gaze goal (the location the gazer was looking at). A one-way ANOVA found a significant main effect of body parts/whole condition on perceptual estimation error, F(2, 150) = 27.76, p = 5.54e-11. FDR-corrected post hoc tests showed significantly higher error in the headless bodies condition (error = 498.19 pixels, SE = 17.79 pixels) than in the intact (error = 358.83 pixels, SE = 13.67 pixels), p = 9.31e-10, and floating heads (error = 370.65 pixels, SE = 11.88 pixels) conditions, p = 1.21e-08 (Figure 7). There was no difference between the intact and floating heads conditions, p = 0.57. The results indicate that the presence of heads improves gaze estimation accuracy. 
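Figure 7's caption defines this error via the centroid of observers' clicks; a minimal Python sketch of that per-image metric follows (variable names are assumptions, and using the peak of a smoothed density map instead, as described above, would add a kernel-smoothing step):

```python
import numpy as np

def estimation_error(clicks, gaze_goal):
    """Pixel distance between the centroid of observers' click estimates
    and the ground-truth gaze goal (head centroid of the gazed-at person).

    clicks:    (N, 2) array of (x, y) click locations for one image
    gaze_goal: (x, y) ground-truth location
    """
    centroid = np.asarray(clicks, dtype=float).mean(axis=0)
    return float(np.hypot(*(centroid - np.asarray(gaze_goal, dtype=float))))

# The condition-level error is the average across images, e.g.:
# np.mean([estimation_error(c, g) for c, g in zip(all_clicks, all_goals)])
```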
To investigate the relationship between perceptual judgments and eye movement fixation locations, we computed a quantitative measure of similarity (normalized dot products) between fixation maps (mean fixations per image = 72, SD = 21) and perceptual judgment maps (mean estimates per image = 32, SD = 4). The matched image-pair normalized dot products for intact fixations-intact perceptual judgments, floating heads-floating heads, and headless bodies-headless bodies were 0.37, 0.38, and 0.38, respectively. The unmatched normalized dot products were 0.22 for the intact condition (averaged across intact-floating heads and intact-headless bodies), 0.24 for the floating heads condition (averaged across floating heads-intact and floating heads-headless bodies), and 0.2 for the headless bodies condition (averaged across headless bodies-intact and headless bodies-floating heads). 
The normalized dot products between fixation maps and perceptual judgment maps for the matched image pairs within each condition were significantly higher than those for the unmatched image pairs (see Figure A3; all p < 1e-05 based on 10,000 randomly permuted pairs). This implies that the body parts/whole manipulations similarly influenced eye movement patterns during the search and the explicit perceptual estimations of gaze goals (under no time constraints). 
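A minimal sketch of the similarity measure and the permutation test, assuming the normalized dot product is the cosine-style normalization (dot product divided by the product of the maps' L2 norms) and that the permutation shuffles which judgment map is paired with which fixation map; both assumptions are ours, as the exact implementation is not specified here:

```python
import numpy as np

def normalized_dot(map_a, map_b):
    """Similarity of two spatial maps: dot product of the flattened maps
    divided by the product of their L2 norms, so identical maps score 1
    and spatially non-overlapping maps score 0."""
    a = np.ravel(map_a).astype(float)
    b = np.ravel(map_b).astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def permutation_p(fix_maps, judge_maps, n_perm=10_000, seed=0):
    """Shuffle the fixation-judgment pairing across images and count
    permutations whose mean similarity meets or exceeds the observed
    matched-pair mean."""
    rng = np.random.default_rng(seed)
    observed = np.mean([normalized_dot(f, j) for f, j in zip(fix_maps, judge_maps)])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(judge_maps))
        shuffled = np.mean([normalized_dot(f, judge_maps[k])
                            for f, k in zip(fix_maps, perm)])
        if shuffled >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction
```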
Figure 1.
 
Example frames from videos with a valid gaze cue. In the videos, the gazers (G) looked at the same designated person (gaze goal). After 200 ms or 500 ms, the target person (T) and some distractor persons (D) appeared in the video. The letter notations were not included in the actual video stimuli during the experiment and are presented here only for illustration purposes. (a) Intact condition: the gazer (G) is intact. (b) Floating heads condition: the gazer (G) has a floating head without a body. (c) Headless bodies condition: the gazer (G) has a headless body. Invalid gaze cue videos were similar except that the gazers looked at a location where a distractor individual rather than the target was presented.
Figure 2.
 
(a) Photograph of the target person and an accompanying close-up shot. Participants were shown the target person's photos on the response screen. (b) Timeline for each trial. The participants were required to fixate at the center cross and press the space bar to initialize the trial. The video started with one or multiple gazers looking at the same person. Once the gazer's head/body movement ended, the central cross disappeared, and observers were free to make eye movements. Either 200 ms or 500 ms after the disappearance of the central cross, other individuals (target with distractors or distractors only) appeared in the video for 1000 ms. Participants indicated whether the target person was present or absent in the video. Observers were free to execute eye movements and were given no instructions related to search strategies.
Figure 3.
 
(a) The frequency of the first saccade starting time relative to the time the gazer's head movement stopped. (b) Histogram of the number of saccades made during the 1000 ms target/distractor presentation. (c) First saccade latency for the 200 ms and 500 ms SOA delays.
Figure 4.
 
Examples of first-fixation heatmaps, where G stands for the gazer, D stands for distractors, T stands for the target person, and dashed lines show the gaze direction. Note that we created different versions of the gazers for each video that were randomly selected during the experiment. (a) Intact condition, valid cues. (b) Floating heads condition, valid cues. (c) Headless bodies condition, valid cues. (d) Intact condition, invalid cues. (e) Floating heads condition, invalid cues. (f) Headless bodies condition, invalid cues.
To quantify differences among eye movements in the three body parts/whole conditions, fixation maps from each condition were compared to the benchmark map of perceptual judgments of gaze goals from the intact condition. We consider the perceptual judgments of gaze goals from intact gazers, made with no time constraints or eye movement restrictions, an upper bound on the perceptually available information about gaze goals. A one-way ANOVA showed a main effect of body parts/whole condition on the normalized dot product between each of the three fixation maps and the perceptual judgment map from the intact condition, F(2, 177) = 4.31, p = 0.015. FDR-corrected post hoc tests showed a significantly higher normalized dot product between the intact fixation map and the intact perceptual judgment map (normalized dot product = 0.37, SE = 0.018) than between the headless bodies fixation map and the intact perceptual judgment map (normalized dot product = 0.30, SE = 0.018), p = 0.016. In addition, the dot product of the floating heads fixation map with the intact perceptual map was marginally higher than that of the headless bodies fixation map (0.35 vs. 0.3), p = 0.08. We found no difference between the normalized dot products (computed with respect to the intact perceptual judgment condition) for the intact and floating heads fixation maps (intact = 0.37, SE = 0.018 versus floating heads = 0.35, SE = 0.018, p = 0.81; Figure 8). These results indicate that the presence of heads led to stronger guidance of eye movements toward the available perceptual information about gaze goals. 
Eye movements strategies across multiple fixations
Our previous fixation analysis focused on the distance of the fixations to the target but did not address the possibility that observers might first fixate a gazer and then proceed to fixate the target or a distractor person. We therefore further investigated trials based on fixation locations relative to the gazers in the video. For trials in which the target person was present, we calculated each fixation's distance to all the gazers and the target person in the video. For each fixation, we took only the distance to the closest gazer to determine whether the fixation was directed (within 2 degrees of visual angle) at any of the gazers. We then classified the trials into four foveation behaviors: (1) gazer (trials for which the fixations were within 2 degrees of any of the gazers but not the target), (2) target (trials for which the fixations were within 2 degrees of the target but not a gazer), (3) both (trials for which the fixations were within 2 degrees of both the target and a gazer), and (4) neither (trials for which the fixations were not within 2 degrees of the target or any gazer; Figure 9a). 
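The four-way classification can be made concrete with a short sketch (the coordinate conventions and function names are our assumptions):

```python
import numpy as np

def classify_trial(fixations, gazers, target, radius=2.0):
    """Label one target-present trial by which people were foveated
    (a fixation foveates a person if it lands within `radius` degrees).

    fixations: (F, 2) fixation locations in degrees of visual angle
    gazers:    (G, 2) gazer head locations
    target:    (2,)   target head location
    Returns 'gazer', 'target', 'both', or 'neither'.
    """
    fix = np.atleast_2d(np.asarray(fixations, dtype=float))
    on_target = np.any(
        np.linalg.norm(fix - np.asarray(target, dtype=float), axis=1) <= radius)
    dists = np.linalg.norm(
        fix[:, None, :] - np.atleast_2d(np.asarray(gazers, dtype=float))[None, :, :],
        axis=2)
    on_gazer = np.any(dists.min(axis=1) <= radius)  # distance to closest gazer
    if on_target and on_gazer:
        return 'both'
    return 'target' if on_target else ('gazer' if on_gazer else 'neither')
```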
Then, we further calculated the proportion of each type of foveation behavior based on cue validity to evaluate the cue effects on guiding all fixations (Figure 9b). For the intact condition, we found that the proportion of trials foveating only on the target person or foveating both on the target and gazers was significantly higher for valid cue trials compared to those with invalid cues (foveating only on target, valid = 12.24%, SE = 1.62%, versus invalid = 3.62%, SE = 0.78%, bootstrap resampling, p < 1e-5; foveating on target and gazer, valid = 25.54%, SE = 3.46%, versus invalid = 10.73%, SE = 1.78%, p < 1e-5). However, there were significantly more trials containing fixations only foveating on the gazers when the cue was invalid compared to that of valid cues (invalid = 59.07%, SE = 3.30%, versus valid = 34.46%, SE = 2.52%, p < 1e-5), all p values were corrected by the FDR. 
No difference between valid and invalid trials was found in the proportion of trials containing no foveations on either the gazers or the target (valid = 27.76%, SE = 3.08%, versus invalid = 26.58%, SE = 3.53%, p = 0.31), all p values corrected by the FDR. Similar results were found in the floating heads and headless bodies conditions (see Table 4 for detailed results). 
Table 4.
 
Comparison of the proportion of trials across four foveation behaviors (with standard errors in parentheses and p values in bold from bootstrap resampling tests) for target-present trials with valid and invalid cues. All bootstrap p values were corrected by false discovery rate (FDR). BOLD p values were significant.
Similarly, when the target person was absent, we classified the trials into the four foveation behaviors, except that we categorized the trials foveating on the gaze goal (a distractor person) rather than the target (Figure 9c). In the intact condition, a bootstrap test (10,000 samples) showed that the proportion of trials foveating only on the gaze goal person (distractor) was significantly lower compared to trials foveating only on the gazers or foveating on both the gaze goal and gazers. A similar effect was found in the floating heads and the headless bodies conditions. There was a significantly higher proportion of trials foveating neither the gazers nor the gaze goal compared to the trials foveating only the gaze goal, indicating a smaller effect in guiding eye movements with only lower body motion (see Table 5 for detailed results). 
Table 5.
 
Comparison of proportion of trials across four foveation behaviors (with standard errors in parentheses and p values in bold from bootstrap resampling tests) for target-absent trials of valid and invalid cue trials. All bootstrap p values were corrected by false discovery rate (FDR). BOLD p values were significant.
Fixation sequences
Our previous analysis focused on individual fixations. We also analyzed sequences of fixations during the visual search. We defined three types of fixation sequences: (1) look at the gaze goal: participants made fixations only within a 2-degree region of the gaze goal location; (2) look at the gaze goal, then look back at any of the gazers: participants first fixated within a 2-degree region of the gaze goal location, then fixated within a 2-degree region of any of the gazers; (3) look at the gaze goal, look back at any of the gazers, then search further: participants first fixated within a 2-degree region of the gaze goal location, then within a 2-degree region of any of the gazers, then fixated locations outside a 2-degree region of any of the gazers to search further for the target person (Table 6). 
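These sequence types amount to collapsing each trial's fixations into region labels and matching the resulting pattern; a hedged sketch (region radius as defined above, implementation details our own):

```python
import numpy as np

def classify_sequence(fixations, gaze_goal, gazers, radius=2.0):
    """Reduce a trial's fixations to region labels ('goal', 'gazer',
    'other') and match the three sequence patterns defined in the text."""
    def region(f):
        f = np.asarray(f, dtype=float)
        if np.linalg.norm(f - np.asarray(gaze_goal, dtype=float)) <= radius:
            return 'goal'
        if min(np.linalg.norm(f - np.asarray(g, dtype=float))
               for g in gazers) <= radius:
            return 'gazer'
        return 'other'
    labels = [region(f) for f in fixations]
    # collapse consecutive repeats: ['goal', 'goal', 'gazer'] -> ['goal', 'gazer']
    collapsed = [l for i, l in enumerate(labels) if i == 0 or l != labels[i - 1]]
    if collapsed == ['goal']:
        return 'goal only'
    if collapsed == ['goal', 'gazer']:
        return 'goal then gazer'
    if collapsed == ['goal', 'gazer', 'other']:
        return 'goal, gazer, then search'
    return 'other pattern'
```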
Table 6.
 
The proportion of trials that contained different types of fixation order.
Improved behavioral target detection performance with valid gaze cues
We analyzed the effect of gaze cues on target detection performance. A three-way (body parts/whole condition × cue validity × delay) within-subject ANOVA found that the hit rate for valid gaze cue trials was significantly higher than that for invalid cue trials in all three conditions, F(1, 19) = 26.24, p = 6.1e-05. There was no significant main effect of body parts/whole condition, F(2, 38) = 1.45, p = 0.25, or SOA, F(1, 19) = 0.523, p = 0.48. For simplicity, we report the results averaged across the two SOAs (Figure 10a; for detailed results see Table 7). A two-way (body parts/whole condition × delay) within-subject ANOVA on the false positive rates (FPR) showed no significant main effect of SOA, F(1, 19) = 1.48, p = 0.24, and no significant main effect of body parts/whole condition, F(2, 38) = 0.93, p = 0.41 (intact FPR = 19.6%, SE = 3.3%; floating heads FPR = 18.8%, SE = 3.8%; headless bodies FPR = 17.7%, SE = 3.14%). We quantified the behavioral gaze cueing effect by computing an index of detectability, d’, for the valid and invalid gaze cue trials and taking their difference: ∆d’ = d’valid − d’invalid. Multiple paired t-tests showed that ∆d’ in all three body parts/whole conditions was significantly greater than 0: intact ∆d’ = 0.64, SE = 0.22, p = 0.004; floating heads ∆d’ = 0.36, SE = 0.13, p = 0.001; headless bodies ∆d’ = 0.32, SE = 0.09, p = 0.001, with p values corrected by FDR. However, there was no significant difference in ∆d’ across the three conditions (intact versus floating heads and headless bodies), F(2, 38) = 0.94, p = 0.40 (Figure 10b). 
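The detectability index d’ for a yes/no task is the standard signal detection quantity Z(hit rate) − Z(false alarm rate). A minimal sketch of ∆d’ follows (the assumption of a false alarm rate shared across cue validities is ours; the paper reports FPR per body parts/whole condition):

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Yes/no detectability index: d' = Z(HR) - Z(FAR), with Z the
    inverse cumulative standard normal."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

def delta_d_prime(hr_valid, hr_invalid, far):
    """Behavioral gaze cueing effect: d'(valid) - d'(invalid),
    assuming a common false alarm rate across validities."""
    return d_prime(hr_valid, far) - d_prime(hr_invalid, far)
```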
Table 7.
 
The hit rate (with standard errors in parentheses and p values in bold), separated by cue validity and SOAs and body parts/whole condition. All p values for Tukey post hoc tests were corrected by false discovery rate (FDR). BOLD p values were significant.
Relationship between behavioral performance and eye movements
We investigated the relationship between the behavioral cueing effect measured by sensitivity (∆d’) and the eye movement cueing effect measured by fixation distance to the target (∆distance). We hypothesized that a larger influence of the gaze cue on observers’ eye movement guidance toward the target would be related to a higher gaze cueing effect on target detection accuracy. Indeed, we found a significant positive correlation, r = 0.41, p = 0.0016, indicating that participants who showed a larger target detection difference between valid and invalid trials (∆d’ = d’valid − d’invalid) also showed a larger difference in minimum distance to the target person (∆distance = distanceinvalid − distancevalid; Figure 11a) across all conditions (two data points were identified as outliers and excluded from the correlation analysis). We also assessed the relationship between an observer's behavioral cueing effect and the observer's fixation distance to the gazers. Does an observer's tendency to look closer at gazers result in larger behavioral cueing effects? We found no significant relationship between the ∆fixation distance to the closest gazer and the behavioral cueing effect (∆d’), r = 0.011, p = 0.93 (Figure 11b). 
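A sketch of the across-participant correlation (the z-score outlier rule here is an illustrative assumption; the paper does not state its exclusion criterion):

```python
import numpy as np

def cueing_correlation(delta_dprime, delta_distance, z_cut=3.0):
    """Pearson r between the behavioral cueing effect (one ∆d' per
    participant/condition) and the eye movement cueing effect
    (∆distance), after excluding points more than z_cut standard
    deviations from either variable's mean."""
    x = np.asarray(delta_dprime, dtype=float)
    y = np.asarray(delta_distance, dtype=float)
    keep = (np.abs(x - x.mean()) / x.std() < z_cut) & \
           (np.abs(y - y.mean()) / y.std() < z_cut)
    return float(np.corrcoef(x[keep], y[keep])[0, 1])
```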
Discussion
Previous studies have mostly focused on assessing the influence of gaze on covert attention and eye movements with simple drawings or static images (Azarian et al., 2017; Bayliss et al., 2004a; Driver et al., 1999; Friesen & Kingstone, 1998; Hietanen, 2002; Kingstone et al., 2000). In this study, we investigated how the head and body movements of dynamic gaze behaviors embedded in a rich visual environment influence eye movements in a visual search task. We presented natural videos in which the gazers appeared at various eccentricities rather than in the foveal region. Our goal was to simulate a real-life scenario in which there is dynamic gaze information and gazers often appear in the visual periphery. For our data set, the gaze cue was non-predictive of the target location to isolate exogenous influences from experiment-specific learned strategies (Droll, Abbey, & Eckstein, 2009; Druker & Anderson, 2010; Geng & Behrmann, 2005). 
The influence of gaze cue validity on eye movement search
We first analyzed how the fixation locations were affected by the gaze cues. Our studies showed an influence of gaze cue validity on eye movement guidance toward the target for the first three saccades. When the gazer(s) looked toward the target (valid gaze cue trials) rather than at a distractor (invalid gaze cue trials), fixations were closer to the target, there was a larger proportion of trials with fixations within 2 degrees of the target, and there was a trend toward shorter times for observers’ foveae to fall within 2 degrees of the target. A finer analysis of eye movement strategies showed that observers fixated (within 2 degrees) only the gazers (and not the target) in a larger proportion of trials when the gaze cue was invalid. A likely explanation is that observers follow the invalid gaze cue and re-fixate the gazers upon realizing that the target is not at the gaze goal. 
Contributions of head and lower body cueing to eye movement guidance
We investigated the separate contributions of the head and lower body by digitally deleting either the gazer's head or the lower body, or deleting neither (intact condition). We found a benefit of valid gaze cues in all three conditions, but with the smallest effect for the headless bodies compared to the other two conditions in the first two fixations (see Figure 5d). Importantly, by the second fixation, the fixation distance to the target converged to a similar value for the intact and floating heads conditions, which was shorter than the second fixation-to-target distance in the headless bodies condition. These results suggest that head dynamics are the main source of guidance for eye movements. The headless bodies condition nonetheless showed a small but significant cueing effect in guiding eye movements closer to the target. The results on eye movement guidance are complementary to a previous study showing a greater cueing influence of the head than the lower body on covert attention and microsaccades when observers maintain fixation during search (Han & Eckstein, 2022). 
Figure 5.
 
(a, b) Distance (degrees) between the first and second fixations and the target person in valid and invalid trials. (c) Distance of the fixations (first to fourth) to the target, by fixation index. (d) The ∆distance (invalid − valid) between the fixations (first to fourth) and the target person. All error bars represent standard errors across subjects. (e) The proportion of trials with the fixation located on the side of the image with the gaze goal.
Figure 6.
 
(a) The proportion of trials with saccades foveating (within 2 degrees) at the target person's head. (b) The average time it took for participants to foveate (within 2 degrees) the target person's head.
Figure 7.
 
(a) Perceptual estimation error of the gaze goal location, defined as the pixel distance between the centroid of all subjects’ clicks and the ground-truth gaze goal location (the head centroid of the person being gazed at), averaged across images. (b-d) Example heatmaps of perceptual estimates of the gaze location from clicks, with regions of lower click density in green and regions of higher density in red, for (b) intact gazers, (c) floating heads gazers, and (d) headless bodies gazers.
Figure 8.
 
The normalized dot product between the fixation maps from the three body parts/whole conditions and the intact perceptual judgment map (the benchmark).
Figure 9.
 
(a) Example fixations for four foveation behaviors. G stands for the gazer, D stands for distractors, T stands for the target person. The red dots represent the fixation locations and are connected in order. The person with the white dashed box was the gaze goal of all the gazers. Participants always started from the central fixation, and then made fixations to find the target during a 1000 ms presentation time. All the annotations are just for illustration purposes and were not present during the experiment. (b) The proportion of target-present trials that contained fixations foveating on: (1) only any gazers, (2) only the target, (3) both any gazers and the target, and (4) neither the gazers nor the target. (c) The proportion of target-absent trials that contained fixations foveating on: (1) only any gazers, (2) only the gaze goal (distractor), (3) both any gazers and the gaze goal (distractor), and (4) neither the gazers nor the gaze goal (distractor).
Figure 10.
 
(a) Hit rate in three body parts/whole conditions separated by cue validity averaged across SOAs. (b) Sensitivity difference ∆d’ (valid d’ – invalid d’) in three body parts/whole conditions. All error bars represent standard errors across subjects.
Figure 11.
 
(a) Correlation between the ∆ minimum distance to the target (invalid − valid) and ∆d’ (valid − invalid). (b) Correlation between the ∆ distance to the closest gazer (invalid − valid) and ∆d’ (valid − invalid). Each dot represents a single participant in a condition (intact, floating heads, or headless bodies).
What might be the reason for the smallest gaze cueing effect for headless bodies? One possibility is that observers cannot extract reliable information about the gazed location from the lower body. To separately quantify the inherent information in the head and lower body that observers can extract to estimate gaze goals, we analyzed the explicit perceptual judgment of gaze direction on static frames from the videos. We showed that observers could accurately estimate gaze goals when the head was present but that estimation errors were large when only the lower body was present. This is consistent with previous results showing that head orientation plays a more important role in gaze perception (Florey, Clifford, Dakin, & Mareschal, 2015). Furthermore, we found that the location distribution of the perceptual judgments on intact images was least similar to the eye movement patterns in the headless bodies videos, and most consistent with those in the intact videos. This indicates that the lesser degree of eye movement guidance with headless bodies in the search task is mediated by observers’ difficulty in extracting gaze information from lower bodies. The gaze information from lower bodies was likely further reduced in the search task because the first saccade decisions are based on gazers appearing in the visual periphery (Loomis et al., 2008; Palanica & Itier, 2014; Yokoyama & Takeda, 2019). 
Gaze cueing and behavioral search accuracy
Our study also showed that behavioral performance in detecting the target person improved with valid gaze cues regardless of the type of gaze information (both head and body, head only, or body only). This is consistent with a higher proportion of trials in which participants looked only at the gazers and failed to foveate the target person when the gaze cue was invalid, and with a higher proportion of trials in which participants foveated the target when the cue was valid. Better behavioral performance with valid gaze cues was likely due to the guidance of eye movements toward the target. The benefit of valid gaze cues on behavioral performance showed no difference across body parts/whole conditions. This might seem inconsistent with the previous study showing that headless bodies had the smallest cueing effect on target detection performance (Han & Eckstein, 2022). However, in the Han and Eckstein (2022) study, observers maintained central fixation and were not allowed to execute eye movements. In the current study, we allowed free eye movements after the presentation of the gaze cues. Participants had enough fixations to move their eyes closer to the target person, likely reducing the differences across the three gaze cue conditions. If we had limited the display to two fixations or a shorter presentation time, we would likely have obtained differences in target detection performance across body parts/whole conditions. 
We also showed a correlation between behavioral performance in target detection and fixation accuracy across all participants. A participant with a larger difference in fixation location between valid and invalid cues tended also to show a larger difference in behavioral performance. This shows that the eye movement pattern was a strong indicator of perceptual decisions, consistent with previous studies on tasks such as visual search and face recognition (Chuk, Chan, & Hsiao, 2014; Eckstein, Beutter, Pham, Shimozaki, & Stone, 2007; Koehler & Eckstein, 2017b; Malcolm & Henderson, 2009; Peterson & Eckstein, 2012). 
Table 8 provides a summary of all the main results in our study relating to the influence of gaze cue validity and the presence of the gazer's head. 
Table 8.
 
Results summary. A check represents a significant effect of a factor on the measured variable, a cross represents no significant effect on the measured variable.
Limitations of the current study
There were some limitations to our study. First, the perceptual judgment task estimating gaze goals presented static frames extracted from the videos and allowed participants to free-view the images. The task did not incorporate the information on the movement dynamics of the head and lower body that was present in the videos. The approach for the perceptual judgment gaze estimation task might have underestimated the amount of gaze information in the actual headless body videos. Second, our study design presented the gazers first and the target/distractors after an SOA of 200 or 500 ms to better isolate the effect of the gazer cue. This design might overestimate the influence of the gazer on eye movements. In real-world scenarios, gazers, targets, and distractors are simultaneously present. Thus, the observers might rely less on gaze cues when sensory information about the target is simultaneously available. Future studies should include the presence of the target and distractors during the dynamic gaze behavior to assess how their presence modulates the gaze cue validity effects in the same way target detectability influences the synthetic cue effects (Eckstein et al., 2013; Shimozaki, Eckstein, & Abbey, 2003). 
Third, previous studies have shown that body orientation biases gaze perception when presented together with head orientation (Hietanen, 2002; Moors et al., 2015). In this study, we did not control the relative angle between the gazers' heads and bodies during filming, so it is unknown whether a larger relative angle, or a larger relative motion difference during dynamic gaze behavior, would produce a larger cueing effect in guiding eye movements. One possible future direction is to explicitly manipulate head and body motion to test how their integration affects eye movement planning. 
Fourth, our study concentrated on head and body movements, whereas a large literature focuses on the influence of the gazer's eyes (Driver et al., 1999; Friesen et al., 2004; Langton, Watt, & Bruce, 2000; Mansfield et al., 2003; Ristic, Friesen, & Kingstone, 2002). Our study is most relevant to scenarios in which the gazers, target, and distractors are situated at a distance from the observer. The mean angle subtended by the heads in our videos was 1.39 degrees (STD = 0.29 degrees). Given that the average vertical length of the head is about 0.24 m (Lee, Hwang Shin, & Istook, 2006), this matches the angle subtended by a real-sized head viewed at a distance of 9.9 m (STD = 1.7 m) in real life. The average vertical length of the human eye spans 2.4 cm (Bekerman, Gottlieb, & Vaiman, 2014). At a 9.9 m distance, the eyes subtend a mean vertical angle of 0.139 degrees, providing degraded information about gaze goals compared to head orientation. Future studies should investigate gazers at smaller distances from the observer and assess how dynamic gazers' eye and head movements are integrated and interact, similar to studies using static images of the head and gaze (Balsdon & Clifford, 2018; Cline, 1967; Langton, 2000; Langton, Honeyman, & Tessler, 2004; Otsuka, 2014). 
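The viewing-distance arithmetic above follows from the standard geometry of visual angle, angle = 2·atan(size / (2·distance)). A minimal sketch reproducing the reported numbers (the function names are ours; the physical sizes come from the anthropometric studies cited above):

```python
import math

def visual_angle_deg(size_m: float, distance_m: float) -> float:
    """Angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

def distance_for_angle_m(size_m: float, angle_deg: float) -> float:
    """Viewing distance (m) at which an object subtends a given angle."""
    return size_m / (2 * math.tan(math.radians(angle_deg) / 2))

head_height_m = 0.24   # average vertical head length (Lee et al., 2006)
eye_height_m = 0.024   # average vertical eye extent (Bekerman et al., 2014)

# A 0.24 m head subtending 1.39 degrees implies a viewing distance near 9.9 m,
d = distance_for_angle_m(head_height_m, 1.39)
# at which the eyes subtend only about 0.139 degrees vertically.
eye_angle = visual_angle_deg(eye_height_m, d)
print(round(d, 1), round(eye_angle, 3))
```

At these small angles the computation is effectively linear, which is why the eye angle is one-tenth of the head angle, matching the 10:1 ratio of the two sizes.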
To summarize, our study extended the gaze cueing effect to search tasks in cluttered scenes and demonstrated the importance of head movements in guiding eye movements and improving target detection performance. 
Acknowledgments
This research was sponsored by the US Army Research Office and was accomplished under Contract Number W911NF-19-D-0001 for the Institute for Collaborative Biotechnologies. M.P.E. was supported by a Guggenheim Foundation Fellowship. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the US Government. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. 
Commercial relationships: none. 
Corresponding author: Miguel P. Eckstein. 
Email: migueleckstein@ucsb.edu. 
Address: Department of Psychological and Brain Sciences, Institute for Collaborative Biotechnologies, University of California, 555 University Road, Santa Barbara, CA 93106, USA. 
References
Azarian, B., Buzzell, G. A., Esser, E. G., Dornstauder, A., & Peterson, M. S. (2017). Averted body postures facilitate orienting of the eyes. Acta Psychologica, 175, 28–32, https://doi.org/10.1016/j.actpsy.2017.02.006. [CrossRef] [PubMed]
Balsdon, T., & Clifford, C. (2018). Task dependent effects of head orientation on perceived gaze direction. Frontiers in Psychology, 9, 2491, https://doi.org/10.3389/fpsyg.2018.02491. [CrossRef] [PubMed]
Bayliss, A. P., di Pellegrino, G., & Tipper, S. P. (2004a). Orienting of attention via observed eye gaze is head-centred. Cognition, 94(1), B1–B10, https://doi.org/10.1016/j.cognition.2004.05.002. [CrossRef] [PubMed]
Bayliss, A. P., di Pellegrino, G., & Tipper, S. P. (2004b). Orienting of attention via observed eye gaze is head-centred. Cognition, 94(1), B1–B10, https://doi.org/10.1016/j.cognition.2004.05.002. [CrossRef] [PubMed]
Bekerman, I., Gottlieb, P., & Vaiman, M. (2014). Variations in eyeball diameters of the healthy adults. Journal of Ophthalmology, 2014, 503645, https://doi.org/10.1155/2014/503645. [CrossRef] [PubMed]
Bochud, F. O., Abbey, C. K., & Eckstein, M. P. (2004). Search for lesions in mammograms: Statistical characterization of observer responses. Medical Physics, 31(1), 24–36. [CrossRef] [PubMed]
Bräuer, J., Call, J., & Tomasello, M. (2005). All great ape species follow gaze to distant locations and around barriers. Journal of Comparative Psychology, 119(2), 145–154, https://doi.org/10.1037/0735-7036.119.2.145. [CrossRef]
Bravo, M. J., & Farid, H. (2009). The specificity of the search template. Journal of Vision, 9(1), 34.1–34.9, https://doi.org/10.1167/9.1.34. [CrossRef] [PubMed]
Brooks, R., & Meltzoff, A. N. (2005). The development of gaze following and its relation to language. Developmental Science, 8(6), 535–543, https://doi.org/10.1111/j.1467-7687.2005.00445.x. [CrossRef] [PubMed]
Burgess, A. E., & Ghandeharian, H. (1984). Visual signal detection. II. Signal-location identification. Journal of the Optical Society of America. A, Optics and Image Science, 1(8), 906–910. [CrossRef] [PubMed]
Castelhano, M. S., & Heaven, C. (2010). The relative contribution of scene context and target features to visual search in scenes. Attention, Perception & Psychophysics, 72(5), 1283–1297, https://doi.org/10.3758/APP.72.5.1283. [PubMed]
Cheal, M., & Lyon, D. R. (1991). Central and peripheral precuing of forced-choice discrimination. The Quarterly Journal of Experimental Psychology Section A, 43(4), 859–880, https://doi.org/10.1080/14640749108400960.
Chuk, T., Chan, A. B., & Hsiao, J. H. (2014). Understanding eye movements in face recognition using hidden Markov models. Journal of Vision, 14(11), 8, https://doi.org/10.1167/14.11.8. [PubMed]
Cline, M. G. (1967). The perception of where a person is looking. The American Journal of Psychology, 80(1), 41–50. [PubMed]
Deza, A., & Eckstein, M. (2016). Can peripheral representations improve clutter metrics on complex scenes? Advances in Neural Information Processing Systems, 2847–2855, https://doi.org/10.48550/arXiv.1608.04042.
Driver, J., Davis, G., Ricciardelli, P., Kidd, P., Maxwell, E., & Baron-Cohen, S. (1999). Gaze perception triggers reflexive visuospatial orienting. Visual Cognition, 6(5), 509–540.
Droll, J. A., Abbey, C. K., & Eckstein, M. P. (2009). Learning cue validity through performance feedback. Journal of Vision, 9(2), 18.1–18.23, https://doi.org/10.1167/9.2.18. [PubMed]
Druker, M., & Anderson, B. (2010). Spatial probability aids visual stimulus discrimination. Frontiers in Human Neuroscience, 4, 63, https://doi.org/10.3389/fnhum.2010.00063. [PubMed]
Eckstein, M. (2017). Probabilistic computations for attention, eye movements, and search. Annual Review of Vision Science, 3, 18.1–18.24.
Eckstein, M. P., Beutter, B. R., Pham, B. T., Shimozaki, S. S., & Stone, L. S. (2007). Similar neural representations of the target for saccades and perception during search. The Journal of Neuroscience, 27(6), 1266–1270, https://doi.org/10.1523/JNEUROSCI.3975-06.2007.
Eckstein, M. P., Beutter, B. R., & Stone, L. S. (2001). Quantifying the performance limits of human saccadic targeting during visual search. Perception, 30(11), 1389–1401. [PubMed]
Eckstein, M. P., Drescher, B. A., & Shimozaki, S. S. (2006). Attentional cues in real scenes, saccadic targeting, and Bayesian priors. Psychological Science, 17(11), 973–980, https://doi.org/10.1111/j.1467-9280.2006.01815.x. [PubMed]
Eckstein, M. P., Mack, S. C., Liston, D. B., Bogush, L., Menzel, R., & Krauzlis, R. J. (2013). Rethinking human visual attention: Spatial cueing effects and optimality of decisions by honeybees, monkeys and humans. Vision Research, 85, 5–19, https://doi.org/10.1016/j.visres.2012.12.011.
Emery, N. J. (2000). The eyes have it: The neuroethology, function and evolution of social gaze. Neuroscience & Biobehavioral Reviews, 24(6), 581–604, https://doi.org/10.1016/S0149-7634(00)00025-7.
Findlay, J. M. (1997). Saccade target selection during visual search. Vision Research, 37(5), 617–631, https://doi.org/10.1016/S0042-6989(96)00218-0. [PubMed]
Florey, J., Clifford, C. W. G., Dakin, S. C., & Mareschal, I. (2015). Peripheral processing of gaze. Journal of Experimental Psychology: Human Perception and Performance, 41(4), 1084–1094, https://doi.org/10.1037/xhp0000068. [PubMed]
Friesen, C. K., & Kingstone, A. (1998). The eyes have it! Reflexive orienting is triggered by nonpredictive gaze. Psychonomic Bulletin & Review, 5(3), 490–495, https://doi.org/10.3758/BF03208827.
Friesen, C. K., & Kingstone, A. (2003). Covert and overt orienting to gaze direction cues and the effects of fixation offset. NeuroReport: For Rapid Communication of Neuroscience Research, 14(3), 489–493, https://doi.org/10.1097/00001756-200303030-00039.
Friesen, C. K., Ristic, J., & Kingstone, A. (2004). Attentional effects of counterpredictive gaze and arrow cues. Journal of Experimental Psychology. Human Perception and Performance, 30(2), 319–329, https://doi.org/10.1037/0096-1523.30.2.319. [PubMed]
Frischen, A., Bayliss, A. P., & Tipper, S. P. (2007). Gaze cueing of attention: Visual attention, social cognition, and individual differences. Psychological Bulletin, 133(4), 694–724, https://doi.org/10.1037/0033-2909.133.4.694. [PubMed]
Geng, J. J., & Behrmann, M. (2005). Spatial probability as an attentional cue in visual search. Perception & Psychophysics, 67(7), 1252–1268. [PubMed]
Gregory, S. (2021). Investigating facilitatory versus inhibitory effects of dynamic social and non-social cues on attention in a realistic space. Psychological Research, 86(5), 1578–1590, https://doi.org/10.1007/s00426-021-01574-7. [PubMed]
Han, N. X., & Eckstein, M. P. (2022). Gaze-cued shifts of attention and microsaccades are sustained for whole bodies but are transient for body parts. Psychonomic Bulletin & Review, 29(5), 1854–1878. [PubMed]
Henderson, J. M., Chanceaux, M., & Smith, T. J. (2009). The influence of clutter on real-world scene search: Evidence from search efficiency and eye movements. Journal of Vision, 9(1), 32.1-8, https://doi.org/10.1167/9.1.32.
Hermens, F., & Walker, R. (2012). Do you look where I look? Attention shifts and response preparation following dynamic social cues. Journal of Eye Movement Research, 5(5), Article 5, 1–11, https://doi.org/10.16910/jemr.5.5.5.
Hietanen, J. K. (1999). Does your gaze direction and head orientation shift my visual attention? Neuroreport, 10(16), 3443–3447, https://doi.org/10.1097/00001756-199911080-00033. [PubMed]
Hietanen, J. K. (2002). Social attention orienting integrates visual information from head and body orientation. Psychological Research, 66(3), 174–179, https://doi.org/10.1007/s00426-002-0091-8. [PubMed]
Hood, B. M., Willen, J. D., & Driver, J. (1998). Adult's eyes trigger shifts of visual attention in human infants. Psychological Science, 9(2), 131–134, https://doi.org/10.1111/1467-9280.00024.
Kaminski, J., Riedel, J., Call, J., & Tomasello, M. (2005). Domestic goats, Capra hircus, follow gaze direction and use social cues in an object choice task. Animal Behaviour, 69(1), 11–18, https://doi.org/10.1016/j.anbehav.2004.05.008.
Kingstone, A., Friesen, C., & Gazzaniga, M. (2000). Reflexive joint attention depends on lateralized cortical connections. Psychological Science, 11, 159–166, https://doi.org/10.1111/1467-9280.00232. [PubMed]
Kleinke, C. L. (1986). Gaze and eye contact: A research review. Psychological Bulletin, 100(1), 78–100, https://doi.org/10.1037/0033-2909.100.1.78. [PubMed]
Koehler, K., & Eckstein, M. (2017a). Beyond scene gist: Objects guide search more than backgrounds. Journal of Experimental Psychology. Human Perception and Performance, 43(6), 1177–1193. [PubMed]
Koehler, K., & Eckstein, M. P. (2017b). Temporal and peripheral extraction of contextual cues from scenes during visual search. Journal of Vision, 17(2), 16, https://doi.org/10.1167/17.2.16. [PubMed]
Kuhn, G., & Tipples, J. (2011). Increased gaze following for fearful faces. It depends on what you're looking for! Psychonomic Bulletin & Review, 18(1), 89–95, https://doi.org/10.3758/s13423-010-0033-1. [PubMed]
Lago, M. A., Jonnalagadda, A., Abbey, C. K., Barufaldi, B. B., Bakic, P. R., Maidment, A. D. A., & Eckstein, M. P. (2021). Under-exploration of three-dimensional images leads to search errors for small salient targets. Current Biology: CB, 31(5), 1099–1106.e5, https://doi.org/10.1016/j.cub.2020.12.029. [PubMed]
Lago, M. A., Sechopoulos, I., Bochud, F. O., & Eckstein, M. P. (2020). Measurement of the useful field of view for single slices of different imaging modalities and targets. Journal of Medical Imaging (Bellingham, Wash.), 7(2), 022411, https://doi.org/10.1117/1.JMI.7.2.022411. [PubMed]
Langton, S. R. H. (2000). The mutual influence of gaze and head orientation in the analysis of social attention direction. The Quarterly Journal of Experimental Psychology Section A, 53(3), 825–845, https://doi.org/10.1080/713755908.
Langton, S. R. H., Honeyman, H., & Tessler, E. (2004). The influence of head contour and nose angle on the perception of eye-gaze direction. Perception & Psychophysics, 66(5), 752–771, https://doi.org/10.3758/BF03194970. [PubMed]
Langton, S. R. H., Watt, R. J., & Bruce, V. (2000). Do the eyes have it? Cues to the direction of social attention. Trends in Cognitive Sciences, 4(2), 50–59, https://doi.org/10.1016/S1364-6613(99)01436-9. [PubMed]
Lee, J.-H., Hwang Shin, S.-J., & Istook, C. L. (2006). Analysis of human head shapes in the United States. International Journal of Human Ecology, 7(1), 77–83.
Loomis, J. M., Kelly, J. W., Pusch, M., Bailenson, J. N., & Beall, A. C. (2008). Psychophysics of perceiving eye-gaze and head direction with peripheral vision: Implications for the dynamics of eye-gaze behavior. Perception, 37(9), 1443–1457, https://doi.org/10.1068/p5896. [PubMed]
Malcolm, G. L., & Henderson, J. M. (2009). The effects of target template specificity on visual search in real-world scenes: Evidence from eye movements. Journal of Vision, 9(11), 8.1–8.13, https://doi.org/10.1167/9.11.8. [PubMed]
Mansfield, E., Farroni, T., & Johnson, M. (2003). Does gaze perception facilitate overt orienting? Visual Cognition, 10(1), 7–14, https://doi.org/10.1080/713756671.
Maylor, E. A., & Hockey, R. (1985). Inhibitory component of externally controlled covert orienting in visual space. Journal of Experimental Psychology. Human Perception and Performance, 11(6), 777–787, https://doi.org/10.1037//0096-1523.11.6.777. [PubMed]
McKee, D., Christie, J., & Klein, R. (2007). On the uniqueness of attentional capture by uninformative gaze cues: Facilitation interacts with the Simon effect and is rarely followed by IOR. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 61(4), 293–303, https://doi.org/10.1037/cjep2007029. [PubMed]
Michel, M., & Geisler, W. S. (2011). Intrinsic position uncertainty explains detection and localization performance in peripheral vision. Journal of Vision, 11(1), 18, https://doi.org/10.1167/11.1.18. [PubMed]
Mitroff, S. R., & Biggs, A. T. (2014). The ultra-rare-item effect: Visual search for exceedingly rare items is highly susceptible to error. Psychological Science, 25(1), 284–289, https://doi.org/10.1177/0956797613504221. [PubMed]
Moors, P., Germeys, F., Pomianowska, I., & Verfaillie, K. (2015). Perceiving where another person is looking: The integration of head and body information in estimating another person's gaze. Frontiers in Psychology, 6, 909, https://www.frontiersin.org/articles/10.3389/fpsyg.2015.00909. [PubMed]
Morvan, C., & Maloney, L. T. (2012). Human visual search does not maximize the post-saccadic probability of identifying targets. PLoS Computational Biology, 8(2), e1002342, https://doi.org/10.1371/journal.pcbi.1002342. [PubMed]
Mulckhuyse, M., & Theeuwes, J. (2010). Unconscious attentional orienting to exogenous cues: A review of the literature. Acta Psychologica, 134(3), 299–309, https://doi.org/10.1016/j.actpsy.2010.03.002. [PubMed]
Müller, H. J., & Findlay, J. M. (1988). The effect of visual attention of peripheral discrimination thresholds in single and multiple element displays. Acta Psychologica, 69(2), 129–155, https://doi.org/10.1016/0001-6918(88)90003-0. [PubMed]
Müller, H. J., & Rabbitt, P. M. (1989). Reflexive and voluntary orienting of visual attention: Time course of activation and resistance to interruption. Journal of Experimental Psychology. Human Perception and Performance, 15(2), 315–330, https://doi.org/10.1037//0096-1523.15.2.315. [PubMed]
Nagy, A. L., Neriani, K. E., & Young, T. L. (2005). Effects of target and distractor heterogeneity on search for a color target. Vision Research, 45(14), 1885–1899, https://doi.org/10.1016/j.visres.2005.01.007. [PubMed]
Nakayama, K., & Mackeben, M. (1989). Sustained and transient components of focal visual attention. Vision Research, 29(11), 1631–1647. [PubMed]
Neider, M. B., & Zelinsky, G. J. (2006). Scene context guides eye movements during visual search. Vision Research, 46(5), 614–621, https://doi.org/10.1016/j.visres.2005.08.025. [PubMed]
Otsuka, Y. (2014). Dual-route model of the effect of head orientation on perceived gaze direction. Journal of Experimental Psychology: Human Perception and Performance, https://doi.org/10.1037/a0036151.
Palanica, A., & Itier, R. J. (2014). Effects of peripheral eccentricity and head orientation on gaze discrimination. Visual Cognition, 22(9–10), 1216–1232, https://doi.org/10.1080/13506285.2014.990545. [PubMed]
Palmer, J., Verghese, P., & Pavel, M. (2000). The psychophysics of visual search. Vision Research, 40(10–12), 1227–1268. [PubMed]
Peterson, M. F., & Eckstein, M. P. (2012). Looking just below the eyes is optimal across face recognition tasks. Proceedings of the National Academy of Sciences of the United States of America, 109(48), E3314–E3323, https://doi.org/10.1073/pnas.1214269109. [PubMed]
Ricciardelli, P., Bricolo, E., Aglioti, S., & Chelazzi, L. (2003). My eyes want to look where your eyes are looking: Exploring the tendency to imitate another individual's gaze. Neuroreport, 13, 2259–2264, https://doi.org/10.1097/01.wnr.0000044227.79663.2e.
Ristic, J., Friesen, C. K., & Kingstone, A. (2002). Are eyes special? It depends on how you look at it. Psychonomic Bulletin & Review, 9(3), 507–513, https://doi.org/10.3758/bf03196306. [PubMed]
Ristic, J., Wright, A., & Kingstone, A. (2007). Attentional control and reflexive orienting to gaze and arrow cues. Psychonomic Bulletin & Review, 14(5), 964–969, https://doi.org/10.3758/BF03194129. [PubMed]
Rosenholtz, R. (2016). Capabilities and limitations of peripheral vision. Annual Review of Vision Science, 2, 437–457. [PubMed]
Rutherford, M. D., & Krysko, K. M. (2008). Eye direction, not movement direction, predicts attention shifts in those with autism spectrum disorders. Journal of Autism and Developmental Disorders, 38(10), 1958–1965, https://doi.org/10.1007/s10803-008-0592-4. [PubMed]
Semizer, Y., & Michel, M. M. (2017). Intrinsic position uncertainty impairs overt search performance. Journal of Vision, 17(9), 13, https://doi.org/10.1167/17.9.13. [PubMed]
Senju, A., & Csibra, G. (2008). Gaze following in human infants depends on communicative signals. Current Biology, 18(9), 668–671, https://doi.org/10.1016/j.cub.2008.03.059.
Shepherd, M., & Müller, H. J. (1989). Movement versus focusing of visual attention. Perception & Psychophysics, 46(2), 146–154, https://doi.org/10.3758/BF03204974. [PubMed]
Shepherd, S. V. (2010). Following gaze: Gaze-following behavior as a window into social cognition. Frontiers in Integrative Neuroscience, 4, 5, https://doi.org/10.3389/fnint.2010.00005. [PubMed]
Shi, J., Weng, X., He, S., & Jiang, Y. (2010). Biological motion cues trigger reflexive attentional orienting. Cognition, 117(3), 348–354, https://doi.org/10.1016/j.cognition.2010.09.001. [PubMed]
Shimozaki, S. S., Eckstein, M. P., & Abbey, C. K. (2003). Comparison of two weighted integration models for the cueing task: Linear and likelihood. Journal of Vision, 3(3), 209–229, https://doi.org/10.1167/3.3.3. [PubMed]
Shimozaki, S. S., Schoonveld, W. A., & Eckstein, M. P. (2012). A unified Bayesian observer analysis for set size and cueing effects on perceptual decisions and saccades. Journal of Vision, 12(6), 27, https://doi.org/10.1167/12.6.27. [PubMed]
Strasburger, H., Rentschler, I., & Jüttner, M. (2011). Peripheral vision and pattern recognition: A review. Journal of Vision, 11(5), 13, https://doi.org/10.1167/11.5.13. [PubMed]
Sun, Y., Stein, T., Liu, W., Ding, X., & Nie, Q.-Y. (2017). Biphasic attentional orienting triggered by invisible social signals. Cognition, 168, 129–139, https://doi.org/10.1016/j.cognition.2017.06.020. [PubMed]
Swensson, R. G., & Judy, P. F. (1981). Detection of noisy visual targets: Models for the effects of spatial uncertainty and signal-to-noise ratio. Perception & Psychophysics, 29(6), 521–534. [PubMed]
Theeuwes, J. (1991). Exogenous and endogenous control of attention: The effect of visual onsets and offsets. Perception & Psychophysics, 49(1), 83–90. [PubMed]
Torralba, A., Oliva, A., Castelhano, M. S., & Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: The role of global features in object search. Psychological Review, 113(4), 766–786, https://doi.org/10.1037/0033-295X.113.4.766. [PubMed]
Verghese, P. (2012). Active search for multiple targets is inefficient. Vision Research, 74, 61–71, https://doi.org/10.1016/j.visres.2012.08.008. [PubMed]
Võ, M. L.-H. (2021). The meaning and structure of scenes. Vision Research, 181, 10–20, https://doi.org/10.1016/j.visres.2020.11.003. [PubMed]
Võ, M. L.-H., Boettcher, S. E., & Draschkow, D. (2019). Reading scenes: How scene grammar guides attention and aids perception in real-world environments. Current Opinion in Psychology, 29, 205–210, https://doi.org/10.1016/j.copsyc.2019.03.009. [PubMed]
Wallis, L. J., Range, F., Müller, C. A., Serisier, S., Huber, L., & Virányi, Z. (2015). Training for eye contact modulates gaze following in dogs. Animal Behaviour, 106, 27–35, https://doi.org/10.1016/j.anbehav.2015.04.020. [PubMed]
Wang, L., Yang, X., Shi, J., & Jiang, Y. (2014). The feet have it: Local biological motion cues trigger reflexive attentional orienting in the brain. NeuroImage, 84, 217–224, https://doi.org/10.1016/j.neuroimage.2013.08.041. [PubMed]
Wolfe, J. M., Horowitz, T. S., & Kenner, N. M. (2005). Cognitive psychology: Rare items often missed in visual searches. Nature, 435(7041), 439–440, https://doi.org/10.1038/435439a. [PubMed]
Wolfe, J. M., Võ, M. L.-H., Evans, K. K., & Greene, M. R. (2011). Visual search in scenes involves selective and nonselective pathways. Trends in Cognitive Sciences, 15(2), 77–84, https://doi.org/10.1016/j.tics.2010.12.001. [PubMed]
Yokoyama, T., & Takeda, Y. (2019). Gaze cuing effects in peripheral vision. Frontiers in Psychology, 10, 708, https://doi.org/10.3389/fpsyg.2019.00708. [PubMed]
Appendix
Figure A1.
 
Example frames from three conditions. Gazers (G), Target (T), Distractor (D). (a-c) Target present with an invalid cue: Gazers (G) gaze at the location with the target (T) and two distractors (D) showing up after a 200 ms or 500 ms delay. In this case, a distractor (D) instead of the target (T) shows up at the gaze goal location, so the cue is invalid. (d-f) Target absent: Gazers (G) gaze at the location with three distractors (D) showing up after a 200 ms or 500 ms delay.
Figure A2.
 
The proportion of trials foveating at the target (within 3 degrees of the visual angle).
Figure A3.
 
Normalized dot product permutation tests in three conditions.
Figure 1.
 
Example frames from videos with a valid gaze cue. In the videos, the gazers (G) looked at the same designated person (gaze goal). After 200 ms or 500 ms, the target person (T) and distractor persons (D) appeared in the video. The letter notations were not included in the actual video stimuli during the experiment and are presented here only for illustration purposes. (a) Intact condition: gazers (G) with intact head and body. (b) Floating heads condition: gazers (G) with floating heads and no bodies. (c) Headless bodies condition: gazers (G) with headless bodies. Invalid gaze cue videos were similar except that the gazers looked at a location where a distractor individual rather than the target was presented.
Figure 2.
 
(a) Photograph of the target person and an accompanying close-up shot. Participants were shown the target person's photos on the response screen. (b) Timeline for each trial. The participants were required to fixate at the center cross and press the space bar to initialize the trial. The video started with one or multiple gazers looking at the same person. Once the gazer's head/body movement ended, the central cross disappeared, and observers were free to make eye movements. Either 200 ms or 500 ms after the disappearance of the central cross, other individuals (target with distractors or distractors only) appeared in the video for 1000 ms. Participants indicated whether the target person was present or absent in the video. Observers were free to execute eye movements and were given no instructions related to search strategies.
Figure 3.
 
(a) The frequency of the first saccade starting time relative to the time the gazer's head movement stopped. (b) Histogram of the number of saccades made during the 1000 ms target/distractor presentation. (c) First saccade latency for the 200 ms and 500 ms SOA delays.
Figure 4.
 
Examples of first-fixation heatmaps, where G stands for the gazer, D for distractors, T for the target person, and dashed lines show the gaze direction. Note that we had different versions of gazers for each video that were randomly selected during the experiment. (a) Intact condition, valid cues. (b) Floating heads condition, valid cues. (c) Headless bodies condition, valid cues. (d) Intact condition, invalid cues. (e) Floating heads condition, invalid cues. (f) Headless bodies condition, invalid cues.
Figure 5.
 
(a, b) Distance (degrees) between the first and second fixations and the target person in valid and invalid trials. (c) The distance between each fixation (first through fourth) and the target, plotted by fixation index. (d) The ∆distance (invalid − valid) between each fixation (first through fourth) and the target person. All error bars represent the standard error across subjects. (e) The proportion of trials with fixations landing on the side of the image containing the gaze goal.
Figure 6.
 
(a) The proportion of trials with saccades foveating (within 2 degrees) at the target person's head. (b) The average time it took for participants to foveate (within 2 degrees) the target person's head.
Figure 7.
 
(a) Perceptual estimation error of the gaze goal location, defined as the pixel distance between the centroid of all subjects' clicks and the ground-truth gaze goal location (the head centroid of the person who was gazed at), averaged across images. (b–d) Examples of heatmaps of perceptual estimates of the gaze goal location from clicks; regions with a lower density of clicks are green and regions with a higher density are red: (b) intact gazers, (c) floating heads gazers, and (d) headless bodies gazers.
Figure 8.
 
The normalized dot product between the fixation maps from the three body parts/whole conditions and the intact-condition perceptual judgment map (used as the benchmark).
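As an illustration of the similarity measure in Figure 8, one common definition of a normalized dot product between two maps is the cosine similarity of their flattened values. The paper does not specify the implementation, so this is a minimal sketch under that assumption; the 2 × 3 density maps below are hypothetical:

```python
import math

def normalized_dot_product(map_a, map_b):
    """Cosine similarity between two flattened density maps."""
    dot = sum(a * b for a, b in zip(map_a, map_b))
    norm_a = math.sqrt(sum(a * a for a in map_a))
    norm_b = math.sqrt(sum(b * b for b in map_b))
    return dot / (norm_a * norm_b)

# Hypothetical 2 x 3 maps, flattened row-wise (illustrative values only)
fixation_map = [0.0, 0.2, 0.8, 0.1, 0.6, 0.3]   # e.g. observers' fixations
judgment_map = [0.1, 0.1, 0.9, 0.0, 0.5, 0.4]   # e.g. perceptual-judgment clicks
r = normalized_dot_product(fixation_map, judgment_map)
print(round(r, 3))
```

The measure equals 1 for identical spatial distributions and 0 for non-overlapping ones, so higher values indicate that fixations concentrated where the perceptual judgments placed the gaze goal.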
Figure 9.
 
(a) Example fixations for four foveation behaviors. G stands for the gazer, D stands for distractors, T stands for the target person. The red dots represent the fixation locations and are connected in order. The person with the white dashed box was the gaze goal of all the gazers. Participants always started from the central fixation, and then made fixations to find the target during a 1000 ms presentation time. All the annotations are just for illustration purposes and were not present during the experiment. (b) The proportion of target-present trials that contained fixations foveating on: (1) only any gazers, (2) only the target, (3) both any gazers and the target, and (4) neither the gazers nor the target. (c) The proportion of target-absent trials that contained fixations foveating on: (1) only any gazers, (2) only the gaze goal (distractor), (3) both any gazers and the gaze goal (distractor), and (4) neither the gazers nor the gaze goal (distractor).
Figure 10.
 
(a) Hit rate in the three body parts/whole conditions, separated by cue validity and averaged across SOAs. (b) Sensitivity difference ∆d′ (valid d′ − invalid d′) in the three body parts/whole conditions. All error bars represent standard errors across subjects.
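The ∆d′ in panel (b) follows the standard signal-detection definition of sensitivity, d′ = z(hit rate) − z(false-alarm rate). A minimal sketch with made-up rates (the numbers are illustrative, not the study's results):

```python
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Signal-detection sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Hypothetical hit and false-alarm rates for valid- and invalid-cue trials.
valid_d = dprime(0.85, 0.20)
invalid_d = dprime(0.70, 0.20)
delta_d = valid_d - invalid_d  # the quantity plotted in panel (b)
```

In practice, hit and false-alarm rates of exactly 0 or 1 are usually adjusted (e.g. by a log-linear correction) before taking the inverse CDF, since z is undefined at those extremes.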
Figure 11.
 
(a) Correlation between the ∆ minimum distance to the target (invalid − valid) and ∆d′ (valid − invalid). (b) Correlation between the ∆ distance to the closest gazer (invalid − valid) and ∆d′ (valid − invalid). Each dot represents a single participant in one condition (intact, floating heads, or headless bodies).
Figure A1.
 
Example frames from the three conditions. Gazers (G), target (T), distractors (D). (a–c) Target present with an invalid cue: the gazers (G) gaze at a location where the target (T) and two distractors (D) appear after a 200 ms or 500 ms delay. Here a distractor (D), rather than the target (T), appears at the gaze goal location, so the cue is invalid. (d–f) Target absent: the gazers (G) gaze at a location where three distractors (D) appear after a 200 ms or 500 ms delay.
Figure A2.
 
The proportion of trials foveating the target (within 3 degrees of visual angle).
Figure A3.
 
Normalized dot product permutation tests in three conditions.
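The permutation tests assess whether an observed similarity difference exceeds what label shuffling would produce by chance. A generic two-sample sketch of the logic, with made-up similarity scores (the study's test would permute map labels, but the resampling idea is the same):

```python
import random

def permutation_p(observed_diff, group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided permutation-test p value for a difference in means.
    A generic sketch, not the paper's exact procedure."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-assign group labels at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed_diff):
            hits += 1
    return hits / n_perm

# Illustrative (made-up) normalized dot products for two conditions.
cond_a = [0.82, 0.79, 0.85, 0.81]
cond_b = [0.61, 0.65, 0.58, 0.63]
obs = sum(cond_a) / len(cond_a) - sum(cond_b) / len(cond_b)
p_value = permutation_p(obs, cond_a, cond_b)
```

The p value is the fraction of shuffled assignments whose mean difference is at least as extreme as the observed one.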
Table 1.
 
Summary of all fixation distances to the target (standard errors across subjects in parentheses; p values below the standard errors). All p values for Tukey post hoc tests were corrected using the false discovery rate (FDR). Bold p values are significant.
Table 2.
 
Mean ∆distance (invalid − valid) to the target for all conditions and fixations, with standard errors across participants and p values for comparisons against 0 (FDR corrected). Bold p values are significant.
Table 3.
 
The proportion of trials in which the first fixation landed on the same side (left/right) as the gaze goal, and bootstrap p values for comparisons against 50% (FDR corrected). Bold p values are significant.
Table 4.
 
Comparison of the proportion of trials across the four foveation behaviors (standard errors in parentheses; p values in bold from bootstrap resampling tests) for target-present trials with valid and invalid cues. All bootstrap p values were corrected by the false discovery rate (FDR). Bold p values are significant.
Table 5.
 
Comparison of the proportion of trials across the four foveation behaviors (standard errors in parentheses; p values in bold from bootstrap resampling tests) for target-absent trials with valid and invalid cues. All bootstrap p values were corrected by the false discovery rate (FDR). Bold p values are significant.
Table 6.
 
The proportion of trials containing each type of fixation order.
Table 7.
 
The hit rate (standard errors in parentheses; p values in bold), separated by cue validity, SOA, and body parts/whole condition. All p values for Tukey post hoc tests were corrected by the false discovery rate (FDR). Bold p values are significant.
Table 8.
 
Results summary. A check mark indicates a significant effect of the factor on the measured variable; a cross indicates no significant effect.