To assess performance in searching for an individual walking in a crowd, we developed a VR-based simulation referred to as the “virtual hallway.” The simulation was a rendering of a hallway of a typical school with a crowd of people walking around the observer. The scene was presented in a dynamic, continuous fashion and viewed from a fixed, first-person perspective. Participants were instructed to search for, locate, and pursue a specific target individual (the principal of the fictitious school) walking in a crowded hallway as soon as that individual appeared from one of eight possible entrances. Each participant then tracked the target's path until the individual was no longer visible on the screen (for a demonstration video of the task, see https://vimeo.com/395817200).
The visual scene was developed using the Unity 3D game engine version 5.6 (Unity Technologies, San Francisco, CA) on an Alienware Aurora R6 desktop computer (Alienware Corporation, Miami, FL) with an Intel i5 processor (Intel Corporation, Mountain View, CA), NVidia GTX 1060 graphics card (NVidia Corporation, Santa Clara, CA), and 32 GB of RAM. The 3D human models were created in Adobe Fuse CC and rigged for animation in Adobe Mixamo (Adobe, San Jose, CA), and the 3D object models were created using Blender modeling software (Blender Foundation, Amsterdam, The Netherlands).
Participants were seated comfortably 60 cm in front of a 27-in. ViewSonic widescreen light-emitting diode monitor (1080p, 1920 × 1080 resolution; ViewSonic Corporation, Brea, CA) (Figure 1A). Search patterns (x, y coordinate positions of gaze on the screen) were captured using the Tobii Eye Tracker 4C (90-Hz sampling rate; Tobii, Danderyd, Sweden). Prior to the first experimental run, eye-tracking calibration was performed for each participant (Tobii Eye Tracking software, version 2.9, calibration protocol), which took less than 1 minute to complete. The process included a seven-point calibration task (screen positions: top–left, top–center, top–right, bottom–left, bottom–center, bottom–right, and center–center) followed by a nine-point post-calibration verification (the same seven calibration points plus center–left and center–right). Calibration was considered accurate if gaze fixation fell within a 2.25° (arc degree) radius around each of the nine points; this was further confirmed by inspection prior to commencing data collection.
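To give a sense of scale for the 2.25° criterion, the following minimal sketch converts that visual angle into an on-screen pixel radius using the viewing distance and monitor specifications stated above. It is an illustration only, not the Tobii verification procedure: it assumes a flat screen with square pixels, that the 27-in. figure is the diagonal of the visible display area, and that the verification point lies near the line of sight (the conversion is less exact toward the screen edges). The function names are hypothetical.

```python
import math

# Values taken from the text above.
VIEW_DISTANCE_CM = 60.0
DIAGONAL_IN = 27.0
RES_X, RES_Y = 1920, 1080
TOLERANCE_DEG = 2.25


def pixels_per_cm(diagonal_in=DIAGONAL_IN, res_x=RES_X, res_y=RES_Y):
    """Pixel density, assuming square pixels and a 27-in. visible diagonal."""
    diagonal_px = math.hypot(res_x, res_y)
    return diagonal_px / (diagonal_in * 2.54)


def tolerance_radius_px(angle_deg=TOLERANCE_DEG, distance_cm=VIEW_DISTANCE_CM):
    """On-screen radius (pixels) subtended by angle_deg at distance_cm."""
    radius_cm = distance_cm * math.tan(math.radians(angle_deg))
    return radius_cm * pixels_per_cm()


def within_tolerance(gaze_xy, target_xy, radius_px=None):
    """True if a gaze sample lies within the tolerance circle of a target."""
    if radius_px is None:
        radius_px = tolerance_radius_px()
    return math.dist(gaze_xy, target_xy) <= radius_px


if __name__ == "__main__":
    print(f"Tolerance radius: {tolerance_radius_px():.1f} px")
```

Under these assumptions the criterion corresponds to a circle roughly 76 pixels in radius around each verification point.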
Participants then viewed and selected their target of choice (i.e., the principal) from four options balanced for gender and race (see Figure 1B). Participants viewed each principal sequentially and independently as the figure rotated about the y-axis. Target selection was incorporated in order to enhance the immersive feel of the task and to confirm that the participant was able to correctly identify the principal in isolation before commencing the study. The interval between a target disappearing and reappearing in the hallway from trial to trial varied between 5 and 15 seconds. The duration of the visibility of the target was primarily determined by its starting point and path length; this varied between 5 and 17 seconds for the closest and farthest points, respectively.
The primary manipulation of interest was crowd density, which was manipulated by varying the number of individuals walking in the hallway (range, 1 to 20 people). This factor was determined by the number of distractor individuals at a given time and was categorized as low (average of 5 ± 5 people), medium (average of 10 ± 5 people), or high (average of 15 ± 5 people) (for examples of each level of crowd density, see Figure 2). Note that there was partial overlap in these ranges, in part because distractors continuously entered and exited the hallway during an experimental run. However, categorization was determined by the sustained average number of distractors present over the course of a specific trial. The second factor of interest was the presence of visual clutter, which was manipulated to investigate the effect of scene complexity on search performance. The visual clutter condition included various objects typically found in a school hallway, such as lockers, water fountains, posters, and pictures. These objects were all absent in the no-clutter condition (for examples of clutter and no-clutter conditions, see Figure 2). Visual clutter was present in 50% and absent in 50% of the trials and interleaved as part of a pseudorandom presentation order.
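Because the density ranges overlap, a trial's category depends on its sustained average rather than on any momentary count. The sketch below illustrates one way such a categorization could be expressed. Assigning a trial to the level whose nominal mean (5, 10, or 15) is closest to the observed average is an assumption made here for illustration; the exact cutoffs used in the study are not stated, and the function name is hypothetical.

```python
from statistics import mean

# Nominal means of the three density levels, taken from the text.
NOMINAL_MEANS = {"low": 5, "medium": 10, "high": 15}


def density_level(distractor_counts):
    """Label a trial by the nominal level nearest its average distractor count.

    distractor_counts: distractor counts sampled over the course of one trial.
    The nearest-mean rule is an illustrative assumption, not the study's rule.
    """
    avg = mean(distractor_counts)
    return min(NOMINAL_MEANS, key=lambda level: abs(NOMINAL_MEANS[level] - avg))


if __name__ == "__main__":
    # A trial that briefly reaches 9 distractors can still average "low".
    print(density_level([2, 4, 9, 5, 3]))
```

The example trial peaks at 9 distractors, which lies inside the medium range, yet its sustained average keeps it in the low category.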
As part of the stimulus design, the target path trajectory was also varied. Specifically, the principal's walking path could originate from one of eight possible locations in the hallway, based on four possible starting distances and entering from either the right or left of the observer. The target would continue walking, either crossing in front of or remaining on the same side as the viewer (for path type options, see Figure 2).
Participants completed three runs of the experiment with a brief rest period in between. Each run lasted approximately 3.5 minutes. Within a run, participants experienced an equal number of trials for the two primary factors of interest (i.e., crowd density and visual clutter) and their respective conditions. Over the course of three runs, these factors were pseudorandomized and balanced in terms of presentation. The paths the target walked were not fully randomized but instead were constrained in order to ensure an equal sampling of starting positions (close vs. far points), path type (crossing vs. same side), and side of hallway (starting from the left vs. right side). As an example, a participant would have the same number of trials where the target crossed the screen for the low, medium, and high crowd densities, but not from every starting point or side. Thus, for each level of crowd density, participants experienced an equal number of trials for left/right starting points and door distance (but not every possible combination was covered for each level of crowd density). This was done to allow for a simpler factorial design while ensuring that the target path variables were not confounded with the two primary manipulations of interest (crowd density and visual clutter).
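The balancing constraints described above can be illustrated with a short sketch. This is not the study's trial-generation code: the number of trials per run, the particular subset of path combinations, and all names are hypothetical. The sketch uses a half-fraction of the side × distance × path-type combinations, so that each path variable is balanced within every crowd density and clutter condition even though only four of the eight possible combinations appear.

```python
import random
from collections import Counter

DENSITIES = ["low", "medium", "high"]
CLUTTER = [True, False]

# Hypothetical half-fraction of the 2 x 2 x 2 path design: each path variable
# is balanced marginally, but only four of the eight combinations are used.
PATHS = [
    ("left", "close", "crossing"),
    ("left", "far", "same side"),
    ("right", "close", "same side"),
    ("right", "far", "crossing"),
]


def build_run(seed=0):
    """Return a pseudorandomly ordered, balanced list of trial dictionaries."""
    rng = random.Random(seed)  # seeded so the order is reproducible
    trials = [
        {"density": d, "clutter": c, "side": s, "distance": dist, "path": p}
        for d in DENSITIES
        for c in CLUTTER
        for (s, dist, p) in PATHS
    ]
    rng.shuffle(trials)
    return trials


def marginal_counts(trials, key, **where):
    """Count the values of one variable among trials matching the filters."""
    return Counter(
        t[key] for t in trials if all(t[k] == v for k, v in where.items())
    )


if __name__ == "__main__":
    run = build_run(seed=1)
    for level in DENSITIES:
        print(level, dict(marginal_counts(run, "side", density=level)))
```

In this toy design every density level contains equal numbers of left and right, close and far, and crossing and same-side trials, which keeps the path variables from being confounded with crowd density or clutter without requiring the full set of combinations.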