Characterizing global and local mechanisms in biological motion perception
Journal of Vision May 2009, Vol.9, 8. doi:10.1167/9.5.8
      Dorita H. F. Chang, Nikolaus F. Troje; Characterizing global and local mechanisms in biological motion perception. Journal of Vision 2009;9(5):8. doi: 10.1167/9.5.8.

Abstract

The perception of biological motion is subserved by both a global process that retrieves structural information and a local process that is sensitive to individual limb motions. Here, we present an experiment aimed to characterize these two mechanisms psychophysically. Naive observers were tested on one of two tasks. In a walker detection task designed to address global processing, observers were asked to discriminate coherent from scrambled walkers presented in separate intervals. In an alternate direction discrimination task designed to address primarily local processing, observers were asked to discriminate walking direction from both coherent and spatially scrambled displays. In both tasks, we investigated performance-specificity to human (versus non-human) motion and the effects of mask density and learning on task performance. Performance in the walker detection task was best for the human walker, was susceptible to learning, and was heavily hindered by increasing mask densities. In contrast, performance on the direction discrimination task, in particular for the scrambled walkers, was unaffected by walker type, did not show a learning trend, and was relatively robust to masking noise. These findings suggest that the visual system processes global and local information contained in biological motion via distinct neural mechanisms that have very different properties.

Introduction
Visual mechanisms that underlie the perception of biological motion, that is, the motion patterns of animate beings (Johansson, 1973), seem to be in place early in human development. For example, 2-day-old infants prefer to look at a point-light walking hen rather than a display of randomly moving dots. These infants also show a looking preference for point-light hens that are oriented upright rather than inverted (Simion, Regolin, & Bulf, 2008). Infants aged 3–4 months can discriminate an upright human movement pattern as depicted by a set of moving point-lights from an inverted version of the same pattern or from a pattern of randomly moving dots (Bertenthal, Proffitt, & Cutting, 1984; Fox & McDaniel, 1982). At 5 months of age, infants can discriminate a point-light walker from one in which the dots' spatial organization and temporal phase are disrupted (Bertenthal, Proffitt, & Kramer, 1987) and can discriminate left- from right-facing point-light walkers (Kuhlmeier, Troje, & Lee, submitted). Infant sensitivity to biological motion has also been demonstrated electrophysiologically. To this end, Hirai and Haraki (2005) have shown that the amplitudes of event-related potentials (ERPs) are higher for intact than for scrambled point-light animations in 8-month-old infants. At this age, ERP amplitudes are also higher for point-light displays that are shown in the upright rather than in the inverted orientation (Reid, Hoehl, & Striano, 2006). Considered together, the developmental literature suggests that humans have the capacity to process biological motion within the first year of life and that at least some facilities are present in infants as young as several days. This raises the intriguing possibility that the mechanisms subserving the perception of biological motion may, at least in part, be innate rather than acquired.
Innate mechanisms for processing biologically relevant stimuli have been suggested in research on the filial preference behavior of chicks. Newly hatched, visually naive chicks raised in the dark prefer to approach objects resembling adult conspecifics (i.e., stuffed hens) rather than less naturalistic objects (Johnson, Bolhuis, & Horn, 1985). Newly hatched chicks also respond to point-light animations and prefer to approach displays of intact or scrambled walking hens rather than displays of a rigidly rotating hen or of randomly moving dots (Vallortigara, Regolin, & Marconato, 2005). Furthermore, newly hatched chicks prefer to align with a display of an intact, upright walking hen rather than one in which the intact walker is inverted (Vallortigara & Regolin, 2006). Together, these results suggest that preference and orienting behaviors of newborn chicks are guided by visual invariants that are likely to emanate from their mother.
Morton and Johnson (1991) proposed that the epigenetic mechanisms underlying the development of face perception in infants consist of two systems. They suggested that newborn infants possess an innate, ancestrally acquired mechanism, termed conspec, which contains some knowledge as to the general visual characteristics of conspecific faces. An innate predisposition for attending to faces is supported by findings that newborn infants prefer to track a moving schematic face rather than a scrambled face or a blank head outline (Goren, Sarty, & Wu, 1975; Maurer & Young, 1983). Importantly, this mechanism is presumed to guide attention toward faces, ensuring that the developing visual system receives ample exposure to a stimulus class that it must learn so much about. A proposed second mechanism, termed conlern, is responsible for exactly that: learning the sophisticated details that convey specific information about identity, emotion, and other aspects of human faces.
It is conceivable that both innate and acquired visual mechanisms may also contribute to the perception of animate motion patterns in adult humans. This matter has received little consideration in the literature, however, perhaps because of the tendency to view biological motion perception as a single-mechanism phenomenon. Biological motion perception has generally been regarded as either a local phenomenon, relying foremost on motion signals of the individual dots (e.g., Mather, Radford, & West, 1992) or a global, form-from-motion phenomenon, relying predominantly on the display's spatiotemporal organization (e.g., Beintema & Lappe, 2002; Bertenthal & Pinto, 1994; Chatterjee, Freyd, & Shiffrar, 1996; Shiffrar, Lichtey, & Heptulla Chatterjee, 1997). Troje (2008) suggested that biological motion perception should instead be regarded as a multi-level phenomenon allowing for distinct contributions of a mechanism that exploits the local motion of the individual dots or body parts, and a second mechanism that uses motion to retrieve the global, dynamically changing shape of a body in motion. This view is supported by the finding of at least two distinct sources for the characteristic inversion effect in biological motion perception; that is, the impairment in perceiving point-light displays that are presented upside down (Troje & Westhoff, 2006). 
Impaired perceptual performance with display inversion has been demonstrated consistently in biological motion perception. For example, the abilities to recognize a walking dog (Pavlova, 1989), judge human action type (Dittrich, 1993), and discriminate gender (Barclay, Cutting, & Kozlowski, 1978) are disrupted upon inverting point-light displays. The inversion effect associated with biological motion perception has often been attributed to impaired processing of global, configural information (e.g., Bertenthal & Pinto, 1994). However, Troje and Westhoff (2006) have demonstrated that in addition to an inversion effect that may be attributable to the inversion of global form, there is a second inversion effect that relies on local motion conveyed by dots representing the feet of a walker. In their study, observers were presented with upright and inverted versions of point-light displays that were organized coherently or spatially scrambled (i.e., with individual dot trajectories displaced randomly) in a direction discrimination task. They found that observers were well able to discern walking direction not only from coherently organized displays, but also from spatially scrambled displays that lack any configural information. Critically, they showed that display inversion causes impairment in performance for both the coherent and scrambled displays. The inversion effect associated with scrambled displays cannot be attributed to impaired global form processing and must be local in nature. Further investigations revealed that the cues for direction of motion in scrambled displays and the associated inversion effect are carried by the feet of the walker. This finding is consistent with a previous report by Mather et al. (1992) regarding the importance of the foot motions in biological motion perception. 
Troje and Westhoff proposed that the visual mechanisms responsible for the local inversion effect may constitute an innate and non-specific life detection system that is distinct from an acquired system responsible for processing global shape that is required for more specific identification of an agent and its action (see also Johnson, 2006; Troje, 2008). To this end, we have shown that, like the ability to discriminate direction, the perception of animacy from spatially scrambled point-light displays is also orientation-specific, suggesting that the relevant local mechanisms may in fact convey information about animacy (Chang & Troje, 2008). The walking direction of scrambled displays can be well discriminated in the visual periphery as well (Gurnsey, Roddy, Ouhnana, & Troje, 2008; Thompson, Hansen, Hess, & Troje, 2007). In addition, a recent study by Jiang, Zhang, and He (submitted) showed that upright, but not inverted, scrambled biological motion attracts attention and is processed automatically.
Based on findings with neonate infants, chicks, and the data obtained from human adult observers, we hypothesize that the two mechanisms work together in a manner similar to that of the mechanisms proposed by Morton and Johnson (1991) for face perception. We suggest that local motion drives an early mechanism which is evolutionarily old and possibly innate. It may serve as a general detection system that directs attention to a stimulus class of great importance. That there may be evolutionarily acquired mechanisms toward identifying humans and animals has been suggested previously (New, Cosmides, & Tooby, 2007). New et al. (2007) hypothesized that humans have evolved a higher level of spontaneous recruitment of attention to humans and other animals, due to their ancestral values for survival and social opportunities, as compared to objects. Indeed, they found that performances on a change detection task in which observers were required to detect a difference between scenes that are identical except for a change in one target, were better when changes involved an animate rather than an inanimate target (e.g., a vehicle). 
Once an animate target is detected, it can then be foveated and other mechanisms can be employed to further investigate it. We suggest that a crucial step at this level is the retrieval of the articulated, dynamically changing shape of the body of an animal or human. In the case of biological motion point-light displays, information about the articulation of the distinct dots is carried by the partly rigid motion of the segments of the body. The motion-mediated shape is eventually analyzed to obtain more detailed information about the nature of the animal and its actions. 
Here, we present an experiment aimed to characterize the mechanisms responsible for the retrieval of global, motion-mediated shape, on the one hand, and for the retrieval of information contained in the local motion of individual dots, on the other hand. Note that we use the terms “local” and “global” in a very restricted and specific way. Our distinction between global, motion-mediated shape and local motion is not related to the usage of the terms “local motion” versus “global motion” when distinguishing between (local) object motion and (global) self-induced optic flow. Our usage of these terms is also not related to the intrinsic non-rigid (local) deformation of an articulated body and its overall (global) translation or rotation in space. 
We tested observers on one of two tasks. One group of observers was tested on a walker detection task that addressed the global aspect of biological motion perception—requiring the observer to segregate the coherent structure of a walker from a mask of dots with similar local motion (i.e., a scrambled walker mask). The use of such a mask in this task renders the individual local trajectories of the walker uninformative and requires the observers to detect the structure of the walker. A second group of observers was asked to discriminate walking direction from both coherent displays (containing global and local information) and spatially scrambled displays (containing local information only). Walkers in this second task were embedded in a mask of randomly positioned flickering dots in order to retain informative local trajectories, in particular for the scrambled walker which, in contrast to the coherent walker, does not contain global structural information. Note that although coherent displays were included in order to render the task more rewarding for observers, the second task was designed to investigate primarily direction retrieval from local biological motions (scrambled displays). In both tasks, we investigated the effects of walker type, mask density, and learning on task performance. In addition to examining these factors and comparing their effects between the two tasks, we also compared their effects between coherent and scrambled displays in the second task. 
In consideration of the claimed distinction between local and global mechanisms and their proposed innate and acquired natures, respectively (Troje, 2008), several predictions can be made regarding the behavior of mechanisms underlying the retrieval of local and global information contained in biological motion. If, in fact, the relevant local mechanisms are ancestrally acquired and therefore innate, they may be insensitive to how familiar an observer is with a particular type of walker and may be insensitive to learning effects. In contrast, the relevant global mechanisms, presumed to be acquired individually, may be affected by the familiarity of the walker and may be subject to learning. Furthermore, if the role of the local mechanisms is to guide attention to animate motions, they may be pre-attentive in nature (Troje, 2008). As such, the local mechanisms, as compared to the global mechanisms, may be more tolerant of masking noise. Importantly, any differences between the patterns of results for detecting/discriminating direction of coherent walkers and discriminating direction of scrambled displays would suggest that local and global information contained in biological motion patterns are processed by neural mechanisms with differing properties.
Methods
Participants
Two groups of naive observers participated in this experiment. Group 1 consisted of 12 observers who ranged in age from 17 to 26 years (mean age of 19.2 years; 4 males, 8 females). Group 2 consisted of 12 different observers who ranged in age from 17 to 26 years (mean age of 19.3 years; 4 males, 8 females). All observers had normal or corrected-to-normal vision.
Stimuli and apparatus
The stimuli were derived from point-light sequences of a walking human, cat, and pigeon. The human walker was computed as the average walker from motion-captured data of 50 men and 50 women (Troje, 2002) and was represented by a set of 11 dots. The cat sequence was created by sampling 14 points from single frames of a video sequence showing a cat walking on a treadmill. The pigeon sequence was created from motion-captured data obtained from a pigeon fitted with 11 markers. All walkers were presented in sagittal view (i.e., facing rightwards or leftwards) and displayed stationary walking as if walking on a treadmill. Point-light sequences were shown at their veridical speeds with gait frequencies of 0.93 Hz, 1.7 Hz, and 1.6 Hz for the human, cat, and pigeon, respectively. On each trial, the starting position of the walker within its gait cycle was selected randomly. 
The point-light walkers were presented upright and could be coherently organized (with all points maintaining veridical organization) or spatially scrambled (with points displaced randomly within areas matched to those occupied by the corresponding coherent versions). For observers in Group 1, right-facing coherent point-light walkers were embedded in a 6.4 × 6.4 deg scrambled walker mask. This mask consisted of walker dots (i.e., dots carrying veridical trajectories of the walker) displaced randomly in space. The number of walker dots was sampled in five steps logarithmically from 50 to 150. For observers in Group 2, right- and left-facing walkers were embedded in a 6.4 × 6.4 deg random dot mask. This mask consisted of randomly positioned stationary dots with a limited lifetime of 125 ms, after which the dots were redrawn at new, randomly assigned locations. The number of dots in the random dot mask was sampled in five steps logarithmically from 50 to 750. For this group, target walkers were either coherently organized or spatially scrambled. Positions of the dots for the spatially scrambled walker were randomly selected on each trial.
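The spatial scrambling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function name `scramble_walker`, the data layout (one list of `(x, y)` positions per dot), and the choice to relocate each dot's mean position uniformly within the bounding box of the coherent walker are all hypothetical.

```python
import random

def scramble_walker(trajectories, rng):
    """Spatially scramble a point-light walker (illustrative sketch).

    trajectories: list of dots, each a list of (x, y) positions per frame.
    Each dot keeps its own local motion, but its mean position is moved to
    a random location inside the bounding box of the coherent walker
    (an assumption about how the "matched area" is defined).
    """
    xs = [x for dot in trajectories for x, _ in dot]
    ys = [y for dot in trajectories for _, y in dot]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)

    scrambled = []
    for dot in trajectories:
        mean_x = sum(x for x, _ in dot) / len(dot)
        mean_y = sum(y for _, y in dot) / len(dot)
        # Offset that carries the dot's mean position to a random location.
        dx = rng.uniform(x_min, x_max) - mean_x
        dy = rng.uniform(y_min, y_max) - mean_y
        scrambled.append([(x + dx, y + dy) for x, y in dot])
    return scrambled

# Hypothetical two-dot, three-frame "walker", used only to exercise the function.
walker = [
    [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)],
    [(0.0, 1.0), (0.0, 1.1), (0.0, 1.2)],
]
scrambled = scramble_walker(walker, random.Random(0))
```

Because every frame of a dot is shifted by the same offset, the frame-to-frame displacements (the local motion) are unchanged while the configuration across dots is destroyed.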
The stimuli were generated using MATLAB (Mathworks, Natick, MA) with extensions from the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997), and were displayed on a 22 inch ViewSonic P220f CRT color monitor with 0.25 mm dot pitch, 1280 × 1024 pixels spatial resolution, and 100 Hz frame rate. All stimuli appeared as white dots on a black background and the point-light figures subtended visual angles of 2.1 × 4.6 deg, 4.6 × 2.4 deg, and 3.6 × 3.6 deg for the human, cat, and pigeon, respectively. 
Procedure
Observers completed one of two tasks: observers in Group 1 completed a detection task while observers in Group 2 completed a direction discrimination task. For both groups, participants were first instructed on the task both verbally and by printed instructions on the computer screen. A practice block was then presented during which participants familiarized themselves with the task. After the practice block, participants completed the experiment proper. For both groups, stimuli were viewed binocularly at a distance of 80 cm as maintained by a chin-rest. 
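For readers who want to relate the reported visual angles to physical sizes on the screen, the standard visual-angle geometry can be sketched as below, using the 80 cm viewing distance stated above. The function names are hypothetical, and the pixel conversion uses a made-up pixel size: the text reports a dot pitch and a resolution, but not the physical size of one displayed pixel.

```python
import math

VIEWING_DISTANCE_MM = 800.0  # 80 cm viewing distance stated in the text

def visual_angle_to_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """Physical extent on the screen that subtends angle_deg at the eye."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)

def mm_to_pixels(size_mm, mm_per_pixel):
    """Convert millimetres to pixels; mm_per_pixel must be measured on the display."""
    return size_mm / mm_per_pixel

mask_mm = visual_angle_to_mm(6.4)  # side length of the square mask
human_mm = (visual_angle_to_mm(2.1), visual_angle_to_mm(4.6))

# HYPOTHETICAL pixel size, for illustration only (not reported in the text).
mask_px = mm_to_pixels(mask_mm, mm_per_pixel=0.3)
```

Under this geometry the 6.4 deg mask spans a little under 9 cm on the screen, and the human walker roughly 3 cm by 6.4 cm.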
Coherent walker detection task
Using a two-interval alternative forced-choice paradigm, observers in Group 1 were presented with two consecutive intervals on each trial: one interval containing a coherent walker and an alternate interval containing a scrambled walker, each embedded in a mask of additional scrambled walker dots. The task was to indicate the interval containing the coherent walker. On each trial, the two 1 s display intervals were separated by a 0.5 s inter-stimulus interval during which the screen remained blank. All walkers presented here were right-facing.
The practice block comprised 6 trials showing unmasked versions of all three animal types and 6 trials showing the same displays embedded in a scrambled walker mask of 10 dots. The experiment proper consisted of 300 trials in total that were completed across three different experimental blocks of 100 trials, each featuring one of the three animal types. The order of test blocks was counterbalanced among participants. Each experimental block consisted of five sub-blocks of trials, each of which comprised two repetitions of all possible combinations of the interval of appearance (first or second) and mask density (50, 66, 87, 114, 150 scrambled walker dots). Within each repetition, the order of stimulus presentation was randomized. Feedback was not given for correct/incorrect responses.
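The trial counts follow from the design: 2 intervals × 5 mask densities × 2 repetitions gives 20 trials per sub-block, 100 per block, and 300 overall. A minimal sketch of such a schedule is given below. The dictionary keys and the function name `build_block` are hypothetical, and the random shuffle of block order is only a stand-in for the counterbalancing across participants described in the text.

```python
import itertools
import random

DENSITIES = [50, 66, 87, 114, 150]  # scrambled walker mask dots
INTERVALS = [1, 2]                  # interval containing the coherent walker
ANIMALS = ["human", "cat", "pigeon"]

def build_block(animal, rng, n_subblocks=5, n_repetitions=2):
    """Return the trials of one experimental block for a single animal type."""
    block = []
    for subblock in range(1, n_subblocks + 1):
        for _ in range(n_repetitions):
            repetition = [
                {"animal": animal, "subblock": subblock,
                 "target_interval": interval, "mask_density": density}
                for interval, density in itertools.product(INTERVALS, DENSITIES)
            ]
            rng.shuffle(repetition)  # randomize order within each repetition
            block.extend(repetition)
    return block

rng = random.Random(1)     # fixed seed only so the sketch is reproducible
block_order = ANIMALS[:]
rng.shuffle(block_order)   # stand-in for counterbalancing across participants
schedule = [trial for animal in block_order for trial in build_block(animal, rng)]
```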
Direction discrimination task
Observers in Group 2 were presented with a single 1 s display that consisted of a coherent or a scrambled walker embedded in a random dot mask. Here, a direction discrimination paradigm was used whereby the task was to decide whether the walker appeared to be facing leftwards or rightwards. 
The practice block here comprised 12 trials consisting of unmasked versions of all possible combinations of the three animal types, two stimulus organizations (coherent and scrambled), and two facing directions, and 12 trials consisting of masked versions of these same combinations. For the practice block, the random dot mask consisted of 10 dots. The experiment proper consisted of 300 trials that were completed across three different experimental blocks of 100 trials, each featuring one of the three animal types. The order of test blocks was counterbalanced among participants. Each experimental block consisted of five sub-blocks of trials, each consisting of all possible combinations of stimulus organization, mask density (50, 98, 194, 381, 750 random dots), and facing direction. Within each sub-block, the order of stimulus presentation was randomized. Feedback was not given for correct/incorrect responses.
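The mask densities listed for the two tasks are consistent with five equal steps on a logarithmic scale between the stated end points. The sketch below shows one way to obtain them; the function name `log_spaced_counts` and the rounding to the nearest integer are assumptions, since the text does not say how the values were computed.

```python
def log_spaced_counts(lowest, highest, steps=5):
    """Dot counts spaced in equal logarithmic steps, rounded to integers."""
    ratio = highest / lowest
    return [round(lowest * ratio ** (k / (steps - 1))) for k in range(steps)]

scrambled_mask = log_spaced_counts(50, 150)  # detection task
random_mask = log_spaced_counts(50, 750)     # direction discrimination task
```

With these inputs the function returns 50, 66, 87, 114, 150 and 50, 98, 194, 381, 750, matching the densities reported for the two tasks.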
Results
Coherent walker detection
Due to the dependent and continuous natures of the factors sub-block and mask density, the walker detection data were entered in a multiple regression analysis adjusted for repeated-measures data (Lorch & Myers, 1990) that evaluated mask density, sub-block, and animal type as predictors for detection errors. Briefly, a multiple linear regression model was first fitted to the data of each individual. Individual coefficients obtained from these fits were then evaluated with two-tailed t tests. The linear model as given by the mean coefficients obtained from observers in Group 1 is  
$$e = -0.49 + 0.46 \log m - 0.03\,s - 0.05\,a_1 + 0.01\,a_2, \qquad (1)$$
where error rate (e) is predicted by the logarithm of mask density (m), sub-block (s), and animal type (a1 and a2 in this particular equation binary-code for the human and cat stimulus types, respectively). The analyses indicated that mask density, t(11) = 7.360, p < 0.001, and sub-block, t(11) = −2.805, p = 0.017 were significant predictors of error rate. The contribution of the human walker stimulus type to the model was also different from that of the cat and the pigeon walkers, t(11) = −3.395, p = 0.006, which did not differ, t(11) = 0.561, p = 0.586. Specifically, error rates were lower for the human walker stimulus (mean = 0.26) than for the cat (mean = 0.32) and pigeon (mean = 0.32) stimuli. 
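As a consistency check, Equation 1 can be evaluated over the design and compared with the reported mean error rates. The sketch assumes a base-10 logarithm, a negative intercept, and negative coefficients for sub-block and for the human walker. The last two signs follow from the reported negative t values; the intercept sign and the logarithm base are inferred because they are the only choices that keep predicted error rates in a plausible range. The function names are hypothetical.

```python
import math

DENSITIES = [50, 66, 87, 114, 150]
SUBBLOCKS = [1, 2, 3, 4, 5]

def predicted_error(mask_density, subblock, human=0, cat=0):
    """Equation 1 with assumed signs and a base-10 logarithm."""
    return (-0.49 + 0.46 * math.log10(mask_density)
            - 0.03 * subblock - 0.05 * human + 0.01 * cat)

def mean_prediction(human=0, cat=0):
    """Average the prediction over all mask densities and sub-blocks."""
    values = [predicted_error(m, s, human, cat)
              for m in DENSITIES for s in SUBBLOCKS]
    return sum(values) / len(values)

predictions = {
    "human": mean_prediction(human=1),
    "cat": mean_prediction(cat=1),
    "pigeon": mean_prediction(),
}
reported = {"human": 0.26, "cat": 0.32, "pigeon": 0.32}  # means given in the text
```

Under these assumptions the averaged predictions fall within about 0.01 of the reported means for all three walker types, which is what would be expected from coefficients rounded to two decimals.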
Detection performances, expressed in terms of error rates for the three animal types at the five mask densities, are presented in Figure 1. As reflected by Equation 1 and the figure, error rates generally increase with increases in mask density. 
Figure 1
 
Walker detection performances, expressed in terms of error rates for the three walker types and five mask densities. Error bars represent ±1 standard error of the mean.
Detection performances for the different sub-blocks collapsed across animal type and mask density are presented in Figure 2. Here, decreased error rates with the later sub-blocks are evident. 
Figure 2
 
Walker detection performances, expressed in terms of error rates for the five learning sub-blocks, collapsed across walker type and mask densities. Error bars represent ±1 standard error of the mean.
Direction discrimination
The direction discrimination data were analyzed with a repeated-measures multiple regression model that evaluated factors of mask density, sub-block, animal type, and stimulus organization as predictors for error rate. In order to further examine the effect of stimulus organization on the relationships between error rate and each of mask density, sub-block, and animal type, three interaction terms were also included in this analysis. The linear model as given by the mean coefficients obtained from observers in Group 2 is  
$$e = -0.12 + 0.18 \log m + 0.01\,s + 0.02\,a_1 + 0.04\,a_2 - 0.26\,c + 0.08\,c \log m - 0.02\,c\,s - 0.02\,c\,a_1 - 0.05\,c\,a_2, \qquad (2)$$
where error rate (e) is predicted by the logarithm of mask density (m), sub-block (s), organization (c), and animal type (human and cat stimulus types binary-coded here by a1 and a2, respectively). Individual two-tailed t tests for the regression coefficients indicated that mask density, t(11) = 7.134, p < 0.001, stimulus organization, t(11) = −2.260, p = 0.043, and the stimulus organization by mask density interaction, t(11) = 2.399, p = 0.035, were significant predictors of error rate. All other coefficients were not significantly different from zero.
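As a reading aid, Equation 2 can be split into one line per stimulus organization. The sketch below assumes that c = 1 codes coherent displays and c = 0 scrambled displays (the coding is not stated explicitly, but it matches the lower error rates for coherent stimuli and the negative t value for organization), and that the intercept and the coefficients of c, c·s, c·a1, and c·a2 are negative.

```latex
% Scrambled displays (c = 0): only the main-effect terms remain.
e_{\text{scrambled}} = -0.12 + 0.18 \log m + 0.01\,s + 0.02\,a_1 + 0.04\,a_2

% Coherent displays (c = 1): each interaction coefficient adds to its main effect.
e_{\text{coherent}} = (-0.12 - 0.26) + (0.18 + 0.08) \log m + (0.01 - 0.02)\,s
                      + (0.02 - 0.02)\,a_1 + (0.04 - 0.05)\,a_2
                    = -0.38 + 0.26 \log m - 0.01\,s - 0.01\,a_2
```

Under these assumptions the slope on log mask density is 0.26 for coherent displays against 0.18 for scrambled displays, which is the steeper rise for coherent stimuli that the significant interaction term describes.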
Direction discrimination performances, expressed in terms of error rates for the three animal types at the two stimulus organizations and five mask densities, are presented in Figure 3A. As reflected in the regression model and the figure, the error rates for coherent stimuli (mean = 0.12) were lower than error rates for scrambled stimuli (mean = 0.32). An examination of this figure also reveals a general increase in error rates with increases in mask density, as predicted by Equation 2.
Figure 3
 
(A) Direction discrimination performances, expressed in terms of error rates for coherent and scrambled versions of the three walker types across five mask densities. (B) Mean error rates for coherent and scrambled conditions collapsed across animal type plotted against mask densities. Superimposed linear regression lines carry the mean slopes obtained by plotting error rates against mask density on a logarithmic scale for individual participants. Error bars represent ±1 standard error of the mean.
Figure 3B re-plots the data for coherent and scrambled conditions across the various mask densities, collapsed across animal type. To better illustrate the interaction reflected by the regression model, a further analysis of these two factors was performed with a two-tailed, paired t test comparing coherent and scrambled conditions in terms of regression slopes for individual participant data plotted on a logarithmic scale. That is, for each participant, linear regression slopes were obtained by plotting error rates against mask densities on a logarithmic scale, separately for coherent and scrambled conditions and collapsed across all other factors. The mean slopes are depicted by the regression lines in Figure 3B. As evident in the figure, the increase of errors with increasing mask density was larger for coherent than for scrambled stimuli, t(11) = 2.399, p = 0.035. 
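The slope comparison described above (one regression slope per participant and condition, followed by a paired t test on the slopes) can be sketched as follows. The error rates are invented for three hypothetical observers purely to make the example runnable; they are not data from the study, and the function names are illustrative.

```python
import math
import statistics

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

log_density = [math.log10(m) for m in (50, 98, 194, 381, 750)]

# HYPOTHETICAL error rates for three observers (NOT data from the study).
coherent = [[0.02, 0.05, 0.10, 0.18, 0.30],
            [0.00, 0.04, 0.12, 0.20, 0.28],
            [0.03, 0.06, 0.09, 0.15, 0.31]]
scrambled = [[0.25, 0.28, 0.33, 0.36, 0.40],
             [0.22, 0.27, 0.30, 0.35, 0.41],
             [0.26, 0.30, 0.31, 0.37, 0.38]]

coherent_slopes = [slope(log_density, e) for e in coherent]
scrambled_slopes = [slope(log_density, e) for e in scrambled]
t_value, df = paired_t(coherent_slopes, scrambled_slopes)
```

A positive `t_value` indicates steeper slopes for coherent than for scrambled displays. With 12 observers, as in the study, the test would have 11 degrees of freedom.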
Direction discrimination performances of coherent and scrambled displays for the different sub-blocks collapsed across animal type and mask density are presented in Figure 4. As noted earlier, stimulus organization did not significantly alter the relationship between sub-block and error rate with all other variables held constant. Although not confirmed by the statistical analyses, the figure implies that some degree of learning might have occurred over the first three blocks for the coherent stimuli only. 
Figure 4
 
Direction discrimination performances, expressed in terms of error rates for coherent and scrambled stimuli for the five learning sub-blocks, collapsed across walker type and mask densities. Error bars represent ±1 standard error of the mean.
Discussion
With two tasks, we investigated the effects of walker familiarity (human vs. non-human), mask density, and learning on the ability to discriminate a coherent walker from a scrambled walker (walker detection task) and the ability to discriminate direction from both coherent and scrambled walker displays. While the first task relied on the observer's ability to retrieve the global motion-mediated shape of the walker, the second task (at least where the scrambled walker was involved) allowed us to assess the observer's ability to retrieve information from the local motion of the individual dots. 
The effects of the three factors on the two tasks were very different. Walker type strongly affected the detection task, which requires the retrieval of global form information. Specifically, error rates were lower for the familiar human walker stimuli than for the cat and pigeon stimuli. In contrast, walker type had no effect on the direction discrimination task, which was designed to address primarily visual processing of local motions. The two tasks were also differentially affected by learning. While error rates decreased at the later test blocks for the detection task, no learning effect was observed for the direction discrimination task. For the coherent displays in the direction discrimination task, performance was generally very high and the lack of effects of animal type and learning may be due to a ceiling effect. More important, however, is the lack of such effects for the scrambled displays. Here, performance was lower and the lack of effects of walker type and learning cannot be explained by ceiling performance. Comparing the effect of masking between the detection and direction discrimination tasks is less straightforward as we used very different masks in the two tasks. In the detection task, the mask consisted of additional, scrambled walker dots, whereas in the direction discrimination task, the mask consisted of stationary flickering dots. Common to both tasks is an effect of mask density. 
Considered alone, the increase in error rates with increases in mask density on the walker detection task replicates previous findings involving a comparable mask and task (Hiris, 2007). In this study, Hiris (2007) measured detection sensitivity for a coherent point-light walker, a non-biological structured rotating stimulus, and a non-biological unstructured rotating stimulus, all of which were either embedded in a random-mask or in a scrambled motion mask (in which individual local motions were identical to those of the targets) with varying densities. Interestingly, for the scrambled motion mask, detection sensitivity did not differ between the point-light walker and the non-biological structured stimulus. The author therefore argued that differences between the ability to detect a biological motion stimulus and a non-biological motion stimulus can be explained by the fact that biological motion always contains underlying structure. 
Critically, the comparison of interest for the present purposes is between the coherent and scrambled stimuli with regard to the effect of mask density within the direction discrimination task. As noted earlier, for the coherent walkers, observers could potentially employ both configural information and local motion information as cues to the facing direction of the walker. As such, there is no question that more information is contained in the coherent walker (in the form of additional structure) than in the scrambled walker, where only local motion information is available. Correspondingly, the error rates for the coherent walkers were significantly lower than those for the scrambled walkers. As depicted in Figure 3B, the rate of increase in errors with increases in mask density differed between coherent and scrambled walkers. Specifically, the errors increased more substantially for coherent than for scrambled walkers with increasing mask density. It should be noted, however, that the manner in which the random-dot mask affects the information contained in the two types of walkers is unknown. It is possible that increases in density of the random-dot mask may more readily disrupt configural information contained in the coherent walkers as compared to local motion information contained in the scrambled walkers.
The differences in the patterns of effects observed between detecting/discriminating the direction of a coherent walker and discriminating the direction of a scrambled walker suggest that global and local cues contained in biological motion are retrieved by visual mechanisms with differing properties. Our findings for the two tasks fit well with the hypotheses set forth by Troje and Westhoff (2006) and Troje (2008), namely the proposition of an innate, ancestrally acquired mechanism that is sensitive to invariants contained in the local motion of individual dots and that is distinct from an ontogenetically trained mechanism that retrieves global form in biological motion perception. The direction discrimination results, in particular for the scrambled walkers that contain solely local cues, are consistent with the hypothesis that the local mechanism is non-specific, hard-wired, and possibly innate. Consistent with our predictions, there were no effects of walker type or learning. In addition, performance was relatively tolerant of masking noise, supporting the hypothesis of a robust and possibly pre-attentive local system. In contrast, an individually acquired global processing system may be sensitive to walker type and should exhibit learning effects. The results for our walker detection task are consistent with these latter predictions. We show further that the global system is also comparatively more susceptible to masking noise. 
It is worth noting that an expertise-dependent nature for the global mechanism in biological motion perception is supported by several experimental findings (Fox & McDaniel, 1982; Grossman, Blake, & Kim, 2004; Jastorff, Kourtzi, & Giese, 2006). In one study, Fox and McDaniel (1982) tested the sensitivity of infants aged 2, 4, and 6 months to coherent biological motion patterns using a forced preferential looking technique. On each trial of the relevant experiment, two stimuli were presented side by side: one display of a point-light human runner organized coherently and an alternate display of randomly moving dots. The authors found a visual preference for the coherent biological motion pattern over the random-dot display in 4- and 6-month-olds but not in 2-month-old infants. This suggests that the global mechanism in biological motion perception does not develop until some time between 2 and 4 months of postnatal life. At first glance, the findings of Fox and McDaniel may appear to be inconsistent with those reported by Simion et al. (2008). However, the fact that the newborn infants in the study by Simion et al. preferred to look at point-light animations of a non-human animal (hen) suggests that their preferences may have been guided by form-invariant, local cues rather than global shape. 
Using functional magnetic resonance imaging (fMRI), perceptual learning of coherent biological motion patterns and its effects on neural activity have also been investigated in adults (Grossman et al., 2004). In this study, the authors trained observers to discriminate coherent from scrambled biological motion patterns embedded in noise dots and measured neural activity before and after training. Consistent with our results reported here, behavioral performances of the adults in this study improved from pre- to post-training. Importantly, the behavioral improvements were paralleled by increased activity after training in the posterior superior temporal sulcus (STS) in response to the coherent patterns as compared to pre-training neural activity. 
The posterior STS has often been implicated in the perception of coherent biological motion patterns (e.g., Bonda, Petrides, Ostry, & Evans, 1996; Grossman & Blake, 2001; Grossman et al., 2000; Oram & Perrett, 1994; Vaina, Solomon, Chowdhury, Sinha, & Belliveau, 2001). For example, using fMRI, Grossman et al. (2000) showed that the STS is responsive to coherent point-light displays but is not activated by scrambled point-light displays, coherent motion, or kinetic boundaries. Using positron emission tomography (PET), Bonda et al. (1996) have also shown this area to be active when observers view point-light animations. Thus, this region may well be implicated in the system that processes global form information from biological motion patterns. The possible neural sites for the processing of local biological motions are less clear. It has recently been shown, however, that the extrastriate areas V3 and V3A are differentially responsive to upright and inverted versions of scrambled biological motion displays (Jiang & He, 2007). 
Nonetheless, it is clear that the local and global mechanisms underlying biological motion perception can be regarded as distinct processing modules. While it is assumed that the global mechanism is sensitive to the shape of the walker, it is still unclear which motion properties the local mechanism is tuned to. Troje and Westhoff (2006) showed that the inversion effect associated with spatially scrambled walkers is carried by the motions of the feet. In an earlier study, Mather et al. (1992) showed that performances on coherence and direction discrimination tasks involving point-light walkers were more affected by the omission of the wrist and ankle dots than by the omission of the shoulder and hip dots or the elbow and knee dots. Thus, converging lines of evidence suggest that the cues of interest to the relevant local mechanism are contained in the motions of the feet. What is the nature of these cues? To address this question, we have investigated the local inversion effect in biological motion perception with novel stimuli that display solely foot-specific information (Chang & Troje, 2009). In one experiment, we compared direction discrimination performances for displays containing naturally accelerating foot motions with those containing constant speeds (i.e., with accelerations removed along the trajectory paths). We found that the inversion effect holds for the naturally accelerating stimuli but not for the constant speed stimuli, suggesting that the acceleration contained in the foot motion carries the local motion-based inversion effect in biological motion perception. In light of these findings, we conjecture that the local processing system in biological motion perception is based on knowledge about the characteristic acceleration pattern contained in foot motions as an animal moves through the environment, a pattern that is constrained by gravity, inertia, and the general kinetics of moving bodies. 
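One way to picture a constant speed stimulus of the kind described above is to resample a recorded trajectory at equal steps of arc length, so that the path is preserved while speed changes along it are removed. The sketch below is only an illustration under that assumption; the procedure actually used by Chang and Troje (2009) is not specified here, and the function name and sample points are hypothetical.

```python
import math


def constant_speed_resample(points, n_samples):
    """Resample a 2D polyline at equal arc-length steps (same path, constant speed)."""
    if len(points) < 2 or n_samples < 2:
        raise ValueError("need at least two points and two samples")

    # Cumulative arc length at each original sample.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    resampled = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(n_samples):
        target = total * i / (n_samples - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled


# Hypothetical accelerating motion along a straight line: a short step, then a long one.
accelerating = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(constant_speed_resample(accelerating, 5))
```

The resampled points are equally spaced along the original path, so a display that advances by one point per frame moves at constant speed. Linear interpolation between samples is a simplification; a smoother interpolant could be substituted for densely curved trajectories.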
In sum, recent findings of at least two sources for the inversion effect in biological motion perception have prompted the need to consider separately the contributions of global and local motion processes (Troje & Westhoff, 2006). Here, we have characterized these two processes at the behavioral level. While performances presumed to rely on the global mechanism are sensitive to walker type and learning effects, and are heavily affected by masking noise, performances presumed to exploit the local mechanism are not affected by walker type or learning, and are relatively robust to masking noise. These findings suggest that the human visual system exploits both the information contained in global, motion-mediated form and the information contained in the local motion of individual body parts, and that the retrieval of these two types of information is governed by distinct mechanisms with very different properties. 
Acknowledgments
This research was supported by the Canada Foundation for Innovation (CFI), an NSERC Discovery grant, the NCAP program of the Canadian Institute for Advanced Research, and the Canada Research Chair program. 
Commercial relationships: none. 
Corresponding author: Nikolaus Troje. 
Email: troje@queensu.ca. 
Address: Department of Psychology, Queen's University, Kingston, Ontario, K7L 3N6, Canada. 
References
Barclay, C. D. Cutting, J. E. Kozlowski, L. T. (1978). Temporal and spatial factors in gait perception that influence gender recognition. Perception & Psychophysics, 23, 145–152.
Beintema, J. A. Lappe, M. (2002). Perception of biological motion without local image motion. Proceedings of the National Academy of Sciences of the United States of America, 99, 5661–5663.
Bertenthal, B. I. Pinto, J. (1994). Global processing of biological motions. Psychological Science, 5, 221–225.
Bertenthal, B. I. Proffitt, D. R. Cutting, J. E. (1984). Infant sensitivity to figural coherence in biomechanical motions. Journal of Experimental Child Psychology, 37, 213–230.
Bertenthal, B. I. Proffitt, D. R. Kramer, S. J. (1987). Perception of biomechanical motion by infants: Implementation of various processing constraints. Journal of Experimental Psychology: Human Perception and Performance, 13, 577–585.
Bonda, E. Petrides, M. Ostry, D. Evans, A. (1996). Specific involvement of human parietal systems and the amygdala in the perception of biological motion. Journal of Neuroscience, 16, 3737–3744.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Chang, D. H. F. Troje, N. F. (2008). Perception of animacy and direction from local biological motion signals. Journal of Vision, 8, (5):3, 1–10, http://journalofvision.org/8/5/3/, doi:10.1167/8.5.3.
Chang, D. H. F. Troje, N. F. (2009). Acceleration carries the local inversion effect in biological motion perception. Journal of Vision, 9, (1):19, 1–17, http://journalofvision.org/9/1/19/, doi:10.1167/9.1.19.
Dittrich, W. H. (1993). Action categories and the perception of biological motion. Perception, 22, 15–22.
Fox, R. McDaniel, C. (1982). The perception of biological motion by human infants. Science, 218, 486–487.
Goren, C. C. Sarty, M. Wu, P. Y. (1975). Visual following and pattern discrimination of face-like stimuli by newborn infants. Pediatrics, 56, 544–549.
Grossman, E. D. Blake, R. (2001). Brain activity evoked by inverted and imagined biological motion. Vision Research, 41, 1475–1482.
Grossman, E. D. Blake, R. Kim, C. Y. (2004). Learning to see biological motion: Brain activity parallels behavior. Journal of Cognitive Neuroscience, 16, 1669–1679.
Grossman, E. Donnelly, M. Price, R. Pickens, D. Morgan, V. Neighbor, G. (2000). Brain areas involved in perception of biological motion. Journal of Cognitive Neuroscience, 12, 711–720.
Gurnsey, R. Roddy, G. Ouhnana, M. Troje, N. F. (2008). Stimulus magnification equates identification and discrimination of biological motion across the visual field. Vision Research, 48, 2827–2834.
Heptulla Chatterjee, S. Freyd, J. J. Shiffrar, M. (1996). Configural processing in the perception of apparent biological motion. Journal of Experimental Psychology: Human Perception and Performance, 22, 916–929.
Hirai, M. Hiraki, K. (2005). An event-related potentials study of biological motion perception in human infants. Cognitive Brain Research, 22, 301–304.
Hiris, E. (2007). Detection of biological and nonbiological motion. Journal of Vision, 7, (12):4, 1–16, http://journalofvision.org/7/12/4/, doi:10.1167/7.12.4.
Jastorff, J. Kourtzi, Z. Giese, M. A. (2006). Learning to discriminate complex movements: Biological versus artificial trajectories. Journal of Vision, 6, (8):3, 791–804, http://journalofvision.org/6/8/3/, doi:10.1167/6.8.3.
Jiang, Y. He, S. (2007). Isolating the neural encoding of the local motion component in biological motion [Abstract]. Journal of Vision, 7, (9):551,
Jiang, Y. Zhang, Y. He, S. (submitted). Detecting life motion: Automatic processing of local biological motion information.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14, 201–211.
Johnson, M. H. (2006). Biological motion: A perceptual life detector? Current Biology, 16, R376–R377.
Johnson, M. H. Bolhuis, J. J. Horn, G. (1985). Interaction between acquired preferences and developing predispositions during imprinting. Animal Behaviour, 33, 1000–1006.
Kuhlmeier, V. Troje, N. F. Lee, V. (submitted). Young infants detect the direction of biological motion in point-light displays.
Lorch, Jr., R. F. Myers, J. L. (1990). Regression analyses of repeated measures data in cognitive research. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 149–157.
Mather, G. Radford, K. West, S. (1992). Low-level visual processing of biological motion. Proceedings of the Royal Society of London B: Biological Sciences, 249, 149–155.
Maurer, D. Young, R. (1983). Newborns' following of natural and distorted arrangements of facial features. Infant Behavior and Development, 6, 127–131.
Morton, J. Johnson, M. H. (1991). CONSPEC and CONLERN: A two-process theory of infant face recognition. Psychological Review, 98, 164–181.
New, J. Cosmides, L. Tooby, J. (2007). Category-specific attention for animals reflects ancestral priorities, not expertise. Proceedings of the National Academy of Sciences of the United States of America, 104, 16598–16603.
Oram, M. W. Perrett, D. I. (1994). Responses of anterior superior temporal polysensory (STPa) neurons to “biological motion” stimuli. Journal of Cognitive Neuroscience, 6, 99–116.
Pavlova, M. (1989). The role of inversion in perception of biological motion pattern. Perception, 18, 510.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Reid, V. M. Hoehl, S. Striano, T. (2006). The perception of biological motion by infants: An event-related potential study. Neuroscience Letters, 395, 211–214.
Shiffrar, M. Lichtey, L. Heptulla Chatterjee, S. (1997). The perception of biological motion across apertures. Perception & Psychophysics, 59, 51–59.
Simion, F. Regolin, L. Bulf, H. (2008). A predisposition for biological motion in the newborn baby. Proceedings of the National Academy of Sciences of the United States of America, 105, 809–813.
Thompson, B. Hansen, B. C. Hess, R. F. Troje, N. F. (2007). Peripheral vision: Good for biological motion, bad for signal noise segregation? Journal of Vision, 7, (10):12, 1–7, http://journalofvision.org/7/10/12/, doi:10.1167/7.10.12.
Troje, N. F. (2002). Decomposing biological motion: A framework for analysis and synthesis of human gait patterns. Journal of Vision, 2, (5):2, 371–387, http://journalofvision.org/2/5/2/, doi:10.1167/2.5.2.
Troje, N. F. (2008). Biological motion perception. In A. Basbaum et al. (Eds.), The senses: A comprehensive reference (pp. 231–238). Oxford: Elsevier.
Troje, N. F. Westhoff, C. (2006). The inversion effect in biological motion perception: Evidence for a “life detector”? Current Biology, 16, 821–824.
Vaina, L. M. Solomon, J. Chowdhury, S. Sinha, P. Belliveau, J. W. (2001). Functional neuroanatomy of biological motion perception in humans. Proceedings of the National Academy of Sciences of the United States of America, 98, 11656–11661.
Vallortigara, G. Regolin, L. (2006). Gravity bias in the interpretation of biological motion by inexperienced chicks. Current Biology, 16, R279–R280.
Vallortigara, G. Regolin, L. Marconato, F. (2005). Visually inexperienced chicks exhibit a spontaneous preference for biological motion patterns. PLoS Biology, 3,
Figure 1
 
Walker detection performances, expressed in terms of error rates for the three walker types and five mask densities. Error bars represent ±1 standard error of the mean.
Figure 2
 
Walker detection performances, expressed in terms of error rates for the five learning sub-blocks, collapsed across walker type and mask densities. Error bars represent ±1 standard error of the mean.
Figure 3
 
(A) Direction discrimination performances, expressed in terms of error rates for coherent and scrambled versions of the three walker types across five mask densities. (B) Mean error rates for coherent and scrambled conditions collapsed across animal type plotted against mask densities. Superimposed linear regression lines carry the mean slopes obtained by plotting error rates against mask density on a logarithmic scale for individual participants. Error bars represent ±1 standard error of the mean.
Figure 4
 
Direction discrimination performances, expressed in terms of error rates for coherent and scrambled stimuli for the five learning sub-blocks, collapsed across walker type and mask densities. Error bars represent ±1 standard error of the mean.