Magnetoencephalography adaptation reveals depth-cue-invariant object representations in the visual cortex

Hassan Akhavein, Armita Dehmoobadsharifabadi, Reza Farivar

Journal of Vision, November 2018, Vol. 18(12):6, https://doi.org/10.1167/18.12.6 (Open Access)
Abstract

Independent of edges and 2-D shape, which can be highly informative of object identity, depth cues alone can give rise to vivid and effective object percepts. The processing of different depth cues engages segregated cortical areas, and an efficient object representation would be one that is invariant to depth cues. Here, we investigated depth-cue invariance of object representations by measuring the category-specific response to faces—the M170 response measured with magnetoencephalography. The M170 response is strongest to faces and is sensitive to adaptation, such that repeated presentation of a face diminishes subsequent M170 responses. We exploited this property of the M170 to measure the degree to which the adaptation effect is affected by variations in depth cue and 3-D object shape. Subjects viewed a rapid presentation of two stimuli—an adaptor and a test stimulus. The adaptor was either a face, a chair, or a face-like oval surface, and was rendered with a single depth cue (shading, structure from motion, or texture). The test stimulus was always a shaded face of a random identity, thus completely controlling for low-level influences on the M170 response to the test stimulus. In the left fusiform face area, we found strong M170 adaptation when the adaptor was a face, regardless of its depth cue. This adaptation was marginal in the right fusiform and negligible in the occipital regions. Our results support the presence of depth-cue-invariant representations in the human visual system, alongside size, position, and viewpoint invariance.

Introduction
Humans can identify objects from a variety of depth cues, such as shading, texture, and structure from motion (SFM). Previous studies have suggested distinct cortical mechanisms in encoding different depth cues. For example, SFM is believed to be encoded mostly by dorsal visual areas (Heeger, Boynton, Demb, Seidemann, & Newsome, 1999; Kamitani & Tong, 2006; Tootell et al., 1995; Zeki et al., 1991), while it has been suggested that shading and texture are processed in ventral regions (Merigan, 2000; Merigan & Pham, 1998). But recent studies have revealed cortical areas responsive to multiple depth cues in both the ventral and dorsal pathways (Ban, Preston, Meeson, & Welchman, 2012; Y. Liu, Vogels, & Orban, 2004; Murphy, Ban, & Welchman, 2013; Tsutsui, Jiang, Yara, Sakata, & Taira, 2001; Welchman, Deubelius, Conrad, Bulthoff, & Kourtzi, 2005). 
It has been suggested that integration of the information from these areas generates a coherent percept of the 3-D structure of an object (Ban et al., 2012; Dovencioglu, Ban, Schofield, & Welchman, 2013; Farivar, 2009; Konen & Kastner, 2008). It is not clear to what extent the visual system segregates the information from different depth cues and to what extent it integrates this information to produce the perception of an object's depth. Two competing hypotheses address this question—that the information from different depth cues combines into a single representation in a population of neurons that together encode the depth profile of the object (integration hypothesis), or that the information from different depth cues remains segregated across multiple neural populations that independently represent objects defined by each depth cue (independence hypothesis). We sought to test these two hypotheses by measuring the transfer of neural adaptation across depth cues at the level of the visual system where the M170 response originates. 
Neural adaptation is a phenomenon whereby the repeated presentation of a stimulus results in a diminished response in the same neural population (Brown & Xiang, 1998; Desimone, 1996). This effect has been described with functional magnetic resonance imaging (Grill-Spector, Kushnir, Edelman, Itzchak, & Malach, 1998; Henson, 2003), electroencephalography (EEG; Heisz, Watter, & Shedden, 2006; Kovacs et al., 2006; Retter & Rossion, 2016), and magnetoencephalography (MEG; Harris & Nakayama, 2007, 2008; Simpson et al., 2015). Rapid event-related adaptation MEG paradigms have been used to address the temporal properties of shape processing in the visual system (Huberle & Lutzenberger, 2013; Scholl, Jiang, Martin, & Riesenhuber, 2014). Such an approach has also been used in face-recognition studies because of the adaptation effects that can be robustly measured with both MEG (Kietzmann, Ehinger, Porada, Engel, & Konig, 2016) and EEG (Caharel, d'Arripe, Ramon, Jacques, & Rossion, 2009; Vizioli, Rousselet, & Caldara, 2010). According to the integration hypothesis, if the information from different depth cues taps into the same neural population, the responsiveness of that population should decrease upon the repetition of the stimulus regardless of the type of depth cue. Therefore, at the level of processing at which the M170 occurs, cross-cue adaptation would be evidence for cue-invariant representations, whereas if different depth cues engage distinct neural responses, no cross-cue adaptation should occur. 
We used MEG to measure the spatiotemporal response to a shaded face when preceded by the same or different categories of objects and the same or different types of depth cues. The peak of the evoked potential around 170 ms after stimulus onset that can be recorded in EEG (N170) and MEG (M170) is reported to be significantly higher in response to face stimuli than to nonface controls (Bentin, Allison, Puce, Perez, & McCarthy, 1996; Jeffreys, 1996; J. Liu, Higuchi, Marantz, & Kanwisher, 2000; Sams, Hietanen, Hari, Ilmoniemi, & Lounasmaa, 1997). This evoked response, often localized to the fusiform face area and the posterior superior temporal sulcus (Henson et al., 2003; Horovitz, Rossion, Skudlarski, & Gore, 2004; Sadeh, Podlipsky, Zhdanov, & Yovel, 2010), is sensitive to adaptation because the presence of a face adaptor causes a reduced and delayed M170 response. 
We reasoned that face representations that are invariant with regard to depth cues (Dehmoobadsharifabadi & Farivar, 2016) would elicit cross-cue adaptation—that the M170 would be attenuated for a face stimulus if preceded by another face stimulus regardless of the depth-cue mismatch between the two. We conservatively constrained the test stimulus to always be a shaded face and compared the response of the M170 to this shaded face when the adapting stimuli were faces, face-like control surfaces, or chairs defined purely by individual depth cues—shading, SFM, and textures, as in our previous work (Akhavein & Farivar, 2017; Dehmoobadsharifabadi & Farivar, 2016; Farivar, Blanke, & Chaudhuri, 2009). We observed robust cross-cue adaptation related to face and object recognition in at least one category-selective region of interest, suggesting that depth-cue-invariant object representations can arise in higher levels of the visual system. 
Methods
Subjects
Eleven healthy subjects (eight men, three women) participated in the study. Data quality control led to the exclusion of two, leaving nine subjects (six men, three women) in the final analysis. All subjects underwent anatomical MRI scans, which were used for cortical surface reconstruction. All subjects had normal or corrected-to-normal vision. Prior to the main experiment, subjects were tested to confirm normal stereovision and the ability to categorize all objects used in our study. This study was approved by the Montreal Neurological Institute Ethics Board. 
General design
We used a double-pulse presentation paradigm as described by Jeffreys (1996) to study the response of the M170 after adaptation. In each trial, two stimulus epochs, termed S1 and S2, were presented in succession with a short interstimulus interval. The S1 stimulus consisted of a face, a chair, or a control face (a “cFace”) defined by one of the three depth cues (shading, texture, or SFM; see Footnote 1). The S2 stimulus was always a shaded face whose identity varied on each trial. In 10% of the trials, S2 was a target shaded face that the subjects had to detect. The purpose of the task was to maintain the subjects' attention; the data from these target-present trials were excluded from all analyses. The S1 epoch lasted 1,133 ms (68 refreshes of the screen refreshing at 60 Hz), and the durations of S2 and the interstimulus interval were set to 200 ms (12 refreshes) and 100 ms (six refreshes), respectively. There was a random pause of 1,200 ± 200 ms between trials (intertrial interval), and a red fixation point was presented at the center of the screen during the entire experimental run. Each run comprised 10 S1–S2 pair conditions (3 depth cues × 3 object categories, plus 1 target-present condition that was removed from analysis), and each condition was repeated 20 times (Figure 1). There were five recording sessions, each lasting about 10 min. 
Figure 1
 
Schematic view of the experimental design. The S1 stimulus could be a face, a chair, or a cFace defined by one of the depth cues (shading, texture, structure from motion). The S2 stimulus was always a shaded face of a different identity.
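The trial structure described above maps directly onto a presentation loop. The following is a simplified Python sketch of how such a run could be generated, assuming frame-based timing on the 60-Hz display; function and variable names are illustrative, not taken from the authors' code, and targets are approximated as ~10% of trials rather than a separate condition.

import random

S1_FRAMES, ISI_FRAMES, S2_FRAMES = 68, 6, 12   # 1,133 ms, 100 ms, 200 ms at 60 Hz

DEPTH_CUES = ["shading", "texture", "sfm"]
CATEGORIES = ["face", "chair", "cface"]

def build_run(repeats=20, target_rate=0.10):
    """Build one run: every depth cue x category pairing for S1; S2 is
    always a shaded face, with ~10% target-present detection trials."""
    trials = [{"cue": c, "category": k}
              for c in DEPTH_CUES for k in CATEGORIES for _ in range(repeats)]
    random.shuffle(trials)
    for t in trials:
        t["frames"] = {"s1": S1_FRAMES, "isi": ISI_FRAMES, "s2": S2_FRAMES}
        t["s2_is_target"] = random.random() < target_rate  # detection task
        t["iti_ms"] = 1200 + random.uniform(-200, 200)     # jittered ITI
    return trials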
Stimuli
The stimuli depicted three object categories: faces, chairs, and cFaces. Forty-one synthetic 3-D facial surfaces (20 × S1, 20 × S2, and one target) were generated with random identities using FaceGen Modeller 3.5 software (Singular Inversions, Inc., Toronto, Canada). Chairs were obtained from online open-access libraries in .3ds format and later modified in Autodesk 3ds Studio Max 2013. Twenty chairs containing smooth surfaces were chosen. Twenty cFaces were generated by removing the internal facial features of the face but keeping the contour intact using Autodesk 3ds Studio Max—the internal features of the face were cut out of the facial surface, surface holes were then capped, and the final surfaces were smoothed. cFaces thus had the same contour and average overall facial depth, but no internal features. All object surfaces were later rendered to generate isolated depth cues as described later and following our previous work (Akhavein & Farivar, 2017; Dehmoobadsharifabadi & Farivar, 2016; Farivar et al., 2009). 
Shading
All textures were removed from the surface, and a directional light source was introduced at a 45° angle from the horizon. The frontal view of the face was rendered in orthographic projection to avoid perspective information (Figure 2a); orthographic projection was used for all conditions to keep this cue consistent. 
Figure 2
 
Examples of (a) shading-defined and (b) texture-defined stimuli used.
Texture stimuli
Dot textures were added to the facial surface after the glossiness and soften levels were removed. Self-illumination was applied at 100% to remove shading and shadows, and the final surfaces were rendered with orthographic projection to remove perspective. This process resulted in 3-D surfaces that were defined solely by texture gradients (Figure 2b). 
SFM
Twenty-four thousand dots were projected onto the surface of the objects in spatially uniform random positions. In each frame, the object rotated 0.5° in depth (around the vertical axis), from +4° to −4°. On each frame, the dots were projected back to the 2-D image plane, the 2-D dot density was calculated, and dots were reshuffled between high- and low-density regions to ensure uniform dot density throughout the presentation. On average, 200 dots were shuffled on each frame to compensate for local density changes caused by the 0.5° change in depth. To increase spatial sampling of the object and improve the quality of the SFM stimuli, we generated four independently estimated sets of dot positions over time and overlaid them for each stimulus. This resulted in a variety of dot luminance intensities but more spatial samples than simple white on black. 
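The density-reshuffling step can be made concrete with a short sketch. The Python code below is a simplified illustration of the idea rather than the authors' implementation; the grid size and helper names are assumptions, while the ~200 moved dots per frame follows the description above.

import numpy as np

def reshuffle_dots(xy, grid=32, n_move=200, rng=None):
    """Move dots from over-dense to under-dense cells of the 2-D projection.

    xy : (N, 2) array of projected dot positions, scaled to [0, 1).
    """
    rng = rng or np.random.default_rng()
    cells = (xy * grid).astype(int)                     # grid cell of each dot
    counts = np.zeros((grid, grid), dtype=int)
    np.add.at(counts, (cells[:, 0], cells[:, 1]), 1)
    target = len(xy) / grid ** 2                        # expected dots per cell
    dense = counts[cells[:, 0], cells[:, 1]] > target   # dots in crowded cells
    sparse = np.argwhere(counts < target)               # under-filled cells
    if not dense.any() or len(sparse) == 0:
        return xy
    movers = rng.choice(np.flatnonzero(dense),
                        size=min(n_move, int(dense.sum())), replace=False)
    for i in movers:                                    # re-seat each mover
        cx, cy = sparse[rng.integers(len(sparse))]
        xy[i] = (np.array([cx, cy]) + rng.random(2)) / grid
    return xy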
All stimuli were presented on an LG 3-D LED widescreen monitor (D2342PY, height: 13.5 in.) placed approximately 1.7 m away from the subject's head in the MEG room and covering 9.6° of visual angle. 
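For reference, the visual angle subtended by an extent h at viewing distance d follows the standard relation below. As a back-of-envelope check (not a value reported by the authors), if the 9.6° refers to the displayed stimulus extent at the ~170-cm viewing distance, it corresponds to roughly 28.6 cm on the screen:

\theta = 2\arctan\!\left(\frac{h}{2d}\right)
\quad\Rightarrow\quad
h = 2d\tan\!\left(\frac{\theta}{2}\right) = 2\,(170\ \text{cm})\tan(4.8^\circ) \approx 28.6\ \text{cm}.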
Data acquisition
MEG recordings were made using a 275-channel CTF/VSM instrument with axial gradiometers based on superconducting quantum interference devices. A third-order spatial filter was applied for noise compensation. Magnetic brain activity was digitized at a sampling rate of 2400 Hz, and an antialiasing filter of 600 Hz was applied at the time of recording. Additional electrophysiological signals were recorded using bipolar derivations: Two electrodes were placed on the torso to record electrocardiographic activity, two were placed at the left eye (one above and one below) to capture electrooculographic blink activity, and two were placed on either side of the eyes to capture electrooculographic saccade activity. In addition, a ground electrode was placed on the left shoulder. Head-position tracking coils were placed on the subject, and their locations were digitized using a Fastrak Polhemus device. Additional scalp points were also recorded to capture the subject's head shape for coregistration with the T1 anatomical scan. A photodiode was placed on the back side of the screen to capture stimulus triggers for precise timing of the visual presentation and was recorded on an additional analog channel with the MEG. 
Preprocessing
FreeSurfer was used for 3-D surface reconstruction of the anatomical MRI. We used Brainstorm 3.1 (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011) for preprocessing of the MEG signal and inverse projection onto the cortical surface models. The reconstructed surfaces were registered using the positions of the head-position tracking coils recorded during head digitization. The recordings were band-pass filtered at 0.8–200 Hz, and a notch filter was applied to remove 60-Hz harmonics (power-line contamination). Bad channels were identified based on the power spectral density (Welch's method) of the recordings and eliminated from further analysis. Signal-space projection was used to remove the artifacts generated by eyeblinks and heartbeats, based on the electrooculographic and electrocardiographic activity recorded during the experiment. Independent component analysis with 60 components using the infomax algorithm was used to detect the artifacts generated by the noise of the LCD screen, and the corresponding components were removed from the signals. Bad segments and trials with noisy recordings due to movements or large alpha-wave amplitudes were eliminated from further analysis. 
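As one concrete way to implement the band-pass and notch filtering described above (the authors used Brainstorm; the SciPy-based sketch below is an illustrative stand-in, with filter orders and Q factor chosen by assumption):

import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 2400.0  # sampling rate (Hz)

def preprocess_channel(x):
    """Band-pass 0.8-200 Hz, then notch out 60-Hz power-line harmonics."""
    b, a = butter(4, [0.8, 200.0], btype="bandpass", fs=FS)
    x = filtfilt(b, a, x)                     # zero-phase band-pass
    for harmonic in (60.0, 120.0, 180.0):     # power-line harmonics
        bn, an = iirnotch(harmonic, Q=30.0, fs=FS)
        x = filtfilt(bn, an, x)
    return x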
Noise covariance was computed from 5 min of empty-room recording made while the experiment was running on the screen. For each run, the head model was computed using the overlapping-spheres approach (Huang, Mosher, & Leahy, 1999), and the sources were estimated using minimum-norm imaging with the orientation of the dipoles constrained to be normal to the cortical surface. The sources were averaged across all trials for each condition and projected to the Colin27 anatomy template (Holmes et al., 1998). To effectively characterize the M170 activity, the average source activity for each condition was low-pass filtered at 20 Hz and z-score normalized based on 500 ms of pre-S1 baseline. 
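The baseline normalization lends itself to a compact expression. A minimal NumPy sketch, assuming source time courses with a time axis referenced to S1 onset and a 500-ms pre-S1 baseline window (all names are illustrative):

import numpy as np

def zscore_to_baseline(src, times, baseline=(-0.5, 0.0)):
    """Z-score source activity against the pre-S1 baseline.

    src   : (n_sources, n_times) averaged source activity for one condition.
    times : (n_times,) time axis in seconds, 0 = S1 onset.
    """
    mask = (times >= baseline[0]) & (times < baseline[1])
    mu = src[:, mask].mean(axis=1, keepdims=True)   # baseline mean
    sd = src[:, mask].std(axis=1, keepdims=True)    # baseline SD
    return (src - mu) / sd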
Analysis
The amplitude and latency of the M170 were extracted from the strongest peak of the response within 130–200 ms after stimulus onset, for both S2 and the S1 shaded face. For each subject, the adaptation effect was quantified as the ratio of the M170 peak amplitude in S2 to that in the S1 shaded-face condition. The delay of the response due to adaptation was computed as the difference between the M170 latency in S2 and that in the S1 shaded face. Multivariate (Hotelling T2) and univariate (repeated-measures analysis of variance [ANOVA]) analyses were performed to compare conditions. All statistical analyses were performed using SPSS (v23.0; IBM Corp., Armonk, NY) and MATLAB (MathWorks, Natick, MA). 
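To make the quantification explicit, here is a sketch of the peak extraction and adaptation indices described above (the authors worked in MATLAB and SPSS; the Python below and its variable names are assumptions for illustration):

import numpy as np

def m170_peak(waveform, times, window=(0.130, 0.200)):
    """Return (amplitude, latency) of the strongest peak in 130-200 ms."""
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmax(np.abs(waveform[mask]))
    return waveform[mask][idx], times[mask][idx]

# Adaptation indices for one subject, one condition, one ROI:
# amp_s2, lat_s2     -> M170 peak of the S2 response
# amp_s1f, lat_s1f   -> M170 peak of the S1 shaded-face response
# adaptation_ratio = amp_s2 / amp_s1f
# adaptation_delay = lat_s2 - lat_s1f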
Results
Regions of interest (ROIs)
Previous neuroimaging studies have suggested a network of cortical areas engaged in face perception (Haxby, Hoffman, & Gobbini, 2000), including regions along the occipital gyrus, the superior temporal sulcus (Hoffman & Haxby, 2000; Puce, Allison, Bentin, Gore, & McCarthy, 1998), and the lateral fusiform gyrus (George et al., 1999; Hoffman & Haxby, 2000; Sergent, Ohta, & MacDonald, 1992). We defined our ROIs based on the contrast between the grand-average responses to shaded faces and shaded chairs (Figure 3) in S1—thus they were based on data independent of our S2 measurements. Areas whose peak response amplitude (around 170 ms) in the face-versus-chair contrast exceeded two standard deviations of the baseline were included in the ROIs, and the average waveform of each ROI was defined as the mean signal of the nodes within it. Four clusters of activity were detected—two in the fusiform gyrus and two in the lateral occipital cortex. Although it has been proposed that the M170 response arises through activity in the fusiform face area (Kanwisher, McDermott, & Chun, 1997), our results revealed two other cortical regions in the occipital lobe, in close proximity to what has previously been described with fMRI as the occipital face area (Kanwisher et al., 1997; McCarthy, Puce, Gore, & Allison, 1997; Puce, Allison, Asgari, Gore, & McCarthy, 1996). 
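A sketch of the ROI selection rule (a two-baseline-standard-deviation threshold on the S1 face-versus-chair contrast) may help; the peak window and names below are assumptions for illustration, not the authors' code:

import numpy as np

def roi_mask(face_src, chair_src, times,
             peak_win=(0.150, 0.190), base_win=(-0.5, 0.0)):
    """Flag cortical nodes whose face-vs.-chair contrast around 170 ms
    exceeds two standard deviations of the pre-stimulus baseline."""
    contrast = face_src - chair_src                      # (n_nodes, n_times)
    base = (times >= base_win[0]) & (times < base_win[1])
    peak = (times >= peak_win[0]) & (times <= peak_win[1])
    sd = contrast[:, base].std(axis=1)                   # baseline variability
    return np.abs(contrast[:, peak]).max(axis=1) > 2 * sd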
Figure 3
 
The regions of interest extracted based on the face/chair contrast in S1 (left), and the grand-average M170 activity in response to the S1 face and S1 chair (right).
M170 adaptation effect
The M170 amplitude and latency were extracted for individual subjects by sampling the strongest peak of the signal between 130 and 200 ms. The reader will note that the S2 stimulus was always a shaded face with a random identity, and thus the S2 adaptation across conditions was conservatively controlled. S2 effects were normalized with respect to S1 shaded-face amplitude and onset for each subject and compared across different S1 adaptor conditions. Figures 4 and 5 depict the normalized S2 response amplitude and delay as a function of object category across each depth cue. 
Figure 4
 
The M170 amplitude of S2 as a ratio of the S1 shaded face in different regions of interest when the S1 stimulus was a face, a chair, or a cFace defined by different depth cues. The error bars reflect standard errors across subjects, displayed for illustration only—the error terms used for the inferential statistics reflected within-subject error.
Figure 5
 
Delay of the M170 peak response in S2 compared to S1 shaded face. Error bars reflect standard error across subjects, displayed for illustration only—the error terms used for the inferential statistics reflected within-subject error.
Previous studies have reported the adaptation effect as both a reduced and a delayed M170 response (J. Liu et al., 2000; Simpson et al., 2015). We therefore carried out a multivariate paired-sample Hotelling T2 test to simultaneously assess the relative changes of normalized amplitude and delay between Face Versus Chair and Face Versus cFace for each depth cue and each ROI. The Hotelling T2 results are summarized in Table 1. The effect was significant for shading in both the Face Versus Chair and Face Versus cFace conditions in all ROIs. Shading was expected to show the strongest adaptation effect because in this condition the same depth cue was presented in S1 and S2. The strong adaptation effect for shading could also be due to the way the ROIs were defined: Since the ROIs were extracted based on a contrast between shading-defined stimuli, it is not surprising that we detected the largest adaptation effect in the shading condition. The adaptation effect for shading could also be due to low-level features such as edge contrasts rather than to the depth information. This effect is more prominent when both S1 and S2 are faces, likely because, despite the different identities in S1 and S2, the edges fall in roughly the same positions. 
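For readers unfamiliar with the paired Hotelling T2 test, the sketch below shows one way to compute it on the paired (amplitude ratio, delay) measures, assuming one row per subject per condition; the F conversion is the standard exact transformation with p = 2 variables and n subjects. This is an illustrative stand-in, not the authors' SPSS/MATLAB code.

import numpy as np
from scipy.stats import f as f_dist

def paired_hotelling_t2(cond_a, cond_b):
    """Paired Hotelling T^2 on (n_subjects, 2) arrays of [amp_ratio, delay]."""
    d = cond_a - cond_b                      # per-subject difference vectors
    n, p = d.shape
    dbar = d.mean(axis=0)
    cov = np.cov(d, rowvar=False)            # covariance of the differences
    t2 = n * dbar @ np.linalg.solve(cov, dbar)
    f_stat = (n - p) / (p * (n - 1)) * t2    # exact F transformation
    p_value = f_dist.sf(f_stat, p, n - p)    # df = (p, n - p)
    return t2, p_value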
Table 1
 
Cross-cue adaptation: Summary of the Hotelling T2 (p) statistical test of the significance of changes in the M170 response in S2 in cross-cue adaptation conditions. Note: Bold formatting indicates a significant effect of adaptation.
But these concerns apply only to the shading condition. We are not suggesting that the shading effect is representative of the depth-cue invariance we seek to highlight; rather, it serves as a positive control, showing that we did observe M170 adaptation when the stimulus conditions were similar to those of previous studies (i.e., a matching depth cue, with a substantial amount of edge information). Meanwhile, the effect of low-level features on adaptation is minimal in cross-depth-cue conditions such as texture and SFM. Another possible explanation for the strong adaptation effect in the shading condition is a contribution of low-level features arising from differences in the distribution of spatial frequencies. Faces and chairs have higher spatial-frequency content than the cFace stimuli, but our results suggest that the difference between chairs and cFaces is marginal in the shading condition. A similar effect was reported by Harris and Nakayama (2008), who found minimal adaptation to face contours. They suggested that the M170 response is driven mainly by the internal features of the face and that other face configurations cannot generate M170 adaptation any better than house stimuli. 
Cross-cue adaptation was evident for SFM- and texture-defined faces in the left fusiform area (Table 1). Adaptation for faces was significantly stronger than for chairs in the right fusiform area as well, but comparable to cFaces in this ROI. The effect on the occipital areas was not as prominent as on the fusiform regions. Only the left occipital area showed a cross-depth adaptation effect for textured stimuli, which was also restricted to the Face Versus cFace contrast. 
To further characterize the adaptation effects, we carried out separate univariate repeated-measures factorial ANOVAs on the amplitude and delay estimates of the M170 adaptation. 
Amplitude adaptation
For amplitude variations, the ratio of S2 amplitude to S1 amplitude in each ROI was analyzed separately, with depth cue (texture, SFM) and object category (faces, chairs, cFaces) as within-subject factors. Responses to shaded stimuli were excluded at this stage of the analysis so that inferences concern only cross-depth-cue adaptation effects. There was a main effect of object category in the fusiform areas, all F(2, 16) > 4.349, p < 0.031, with no significant Depth cue × Object category interaction, all F(2, 16) < 1.790, p > 0.199; the effect was not significant in the occipital regions, all F(2, 18) < 1.872, p > 0.186. 
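This kind of within-subject factorial ANOVA can be reproduced with standard tools. A sketch using statsmodels, assuming a long-format table with one row per subject × depth cue × category for a single ROI (column names are illustrative; the authors used SPSS):

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# df columns: subject, cue ('texture'/'sfm'),
# category ('face'/'chair'/'cface'), amp_ratio (S2/S1 M170 amplitude)
def amplitude_anova(df: pd.DataFrame):
    """2 (depth cue) x 3 (object category) within-subject ANOVA on amplitude."""
    model = AnovaRM(df, depvar="amp_ratio", subject="subject",
                    within=["cue", "category"])
    return model.fit()  # .anova_table holds the F, df, and p values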
Latency adaptation
As expected, in the within-depth-cue (shading) condition the adaptation effect on latency was prominent in all four regions, in both the Face Versus Chair and Face Versus cFace contrasts—in both contrasts and all ROIs, t(8) > 2.92, p < 0.019—meaning that the M170 response to S2 was delayed more when S1 was a shaded face than when it was either of the other two categories. 
In cross-depth-cue conditions, the delay of the M170 also exhibited robust sensitivity to adaptation. As for the amplitude analysis, and again excluding the shading condition, a repeated-measures ANOVA was performed on the delay of the M170 response in S2 relative to S1 (latency of the S2 peak minus that of the S1 shaded-face peak), with depth cue (texture, SFM) and object category (faces, chairs, cFaces) as factors. There was a main effect of object category in the fusiform areas and the left occipital region, all F(2, 16) > 4.46, p < 0.029, which indicates a significant change in the latency of the M170 in S2 depending on the S1 object category in these regions. The main effect was not significant in the right occipital area, F(2, 16) = 1.309, p = 0.297. 
Face versus cFace
Faces and cFaces have similar external contours, and the adaptation effect could potentially arise from the external features of the object. If an ROI exhibits significantly different adaptation to faces than to their controls—despite differences in depth cues and the high similarity between the two stimuli—then we can infer that the ROI represents faces selectively and in a depth-cue-invariant manner, despite the highly conservative contrast. 
We first directly compared the strength of adaptation between faces and cFaces in cross-depth-cue conditions using repeated-measures ANOVA with object category (faces, cFaces) and depth cue (texture, SFM) as factors for each ROI separately. The same analysis was performed separately for amplitude and latency. There was a main effect of object category in the left fusiform area for both amplitude, F(1, 8) = 15.422, p = 0.004, and latency, F(1, 8) = 27.696, p = 0.001, with no significant Depth cue × Object category interaction, both F(1, 8) < 2.705, p > 0.139. This main effect was not seen in any other ROI for either the amplitude or the latency of the signal, all F(1, 8) < 0.880, p > 0.373, with the sole exception of the latency of the signal in the left occipital region, F(1, 8) = 6.561, p = 0.034. 
Discussion
Our results from the M170 adaptation paradigm support the notion that faces (and possibly other objects) are represented in certain category-selective areas in a depth-cue-invariant manner (Dehmoobadsharifabadi & Farivar, 2016; Farivar et al., 2009). We observed stronger S2 adaptation when S1 was a face rather than a chair, and this effect was observed irrespective of whether the S1 face was defined by shading, texture, or SFM. As expected from previous studies, the adaptation effect was present as weakened and delayed M170 activity in S2, and jointly accounting for both measures using multivariate statistics revealed adaptation effects that are invariant with respect to depth cue. Regional variations were observed in the strength of the M170 cross-cue adaptation, with adaptation being strongest in the left fusiform area, marginal in the right hemisphere, and almost negligible in the occipital regions. 
We sought to maximally control the potential for extraneous variables contributing to the M170 adaptation, and thus chose to use the same stimulus—a shaded face of random identity—as the S2 stimulus. This ensured that any observed effects were due only to the changes in S1, and no feature of the S2 stimulus itself could have contributed to modulations of the S2-induced M170. While we believe that the M170 adaptation effects ought to generalize to other stimuli used as S2, our choice of a restricted S2 stimulus paradigm precludes such a discussion. 
A critical feature of our study was the use of stimuli defined by single depth cues. We took great care to ensure that stimuli defined by one depth cue did not have any contaminating artifacts from other depth cues, following our previous work (Akhavein & Farivar, 2017; Dehmoobadsharifabadi & Farivar, 2016; Farivar et al., 2009). But binocular viewing of a 2-D screen always introduces some level of conflict between disparity and other depth cues. The viewing distance can also affect the reliability of some depth cues, such as disparity (Johnston, Cumming, & Parker, 1993). Since the viewing distance in our experiment was ∼170 cm, which is relatively large for vision experiments, we expect minimal conflict between disparity and the other depth cues. The large viewing distance might also be one reason that some subjects were unable to perceive the disparity-defined stimuli, leading us to discard those data sets. While the stimuli were rendered using single depth cues, how they were rendered for each depth cue could have varied—for example, different texture patterns could have been used to render the texture-condition stimuli, but we rendered them with only one texture. It is possible that the strength of S1 adaptation would be affected by how the individual depth cues are rendered—certain textures with very high spatial frequencies tend to make the depth profile of the stimulus less clear (Rosenholtz & Malik, 1997). It will be of great value to assess how differences in depth-cue rendering affect depth perception and whether those effects translate to high-level markers of complex vision, like the M170. 
Relatedly, while M170 adaptation effects were observed for faces regardless of the depth cue, the magnitude of the M170 adaptation effect was not identical across the different depth cues. It is difficult to assess whether this difference was due to how the stimuli were rendered for the different depth cues or to some intrinsic quality of the depth cues themselves. More specifically, it is possible that certain depth cues result in a more robust 3-D representation or are more salient, but without matching the low-level features of the stimuli across the different depth cues, such a comparison would be difficult. Other studies have reported that about half the cells in the temporal regions of monkeys do not show depth-cue-invariant responses (Y. Liu et al., 2004; Sary, Vogels, & Orban, 1993); the weaker cross-cue M170 adaptation could therefore be due to the proportion of cells in the ROI that are tuned only to shading stimuli. This could explain the differences in the strength of the adaptation effect across depth cues. Our results suggest that M170 adaptation is transferable across depth cues; a comparison of the relative strength of the depth cues is beyond the scope of our study. 
To control for coarse shape effects, we created meaningless objects that had the same overall contour and average curvature as faces but no internal features. These cFace stimuli resulted in M170 adaptation effects comparable to those of faces in all ROIs except the left fusiform, and in general M170 adaptation was greater for cFaces than for chairs. It is likely that the cFaces contain much of the same information as a face, and thus a portion of the M170 reflects this coarse 3-D contour of a face. It is equally possible that internal features of faces are not well represented in SFM or texture renderings, but we do not believe this to be likely: Subjective experience with the stimuli, and our previous results (Dehmoobadsharifabadi & Farivar, 2016; Farivar et al., 2009), suggest that internal features are effectively represented by SFM. The likely explanation is therefore that the M170 is in part driven by the coarse curvature of a face-like object. 
Different depth cues appear to be processed by separate visual areas. For example, dorsal areas such as MT/V5 are significantly more involved in the processing of motion (Paradis et al., 2000; Tootell et al., 1995; Zeki et al., 1991), and MT in particular appears to have an important role in extraction of structure from motion (Grunewald, Bradley, & Andersen, 2002). Texture-defined objects activate both dorsal and ventral areas like the caudal inferior temporal gyrus, lateral occipital sulcus, and intraparietal sulcus, while shaded objects activate primarily the caudal inferior temporal gyrus (Georgieva, Todd, Peeters, & Orban, 2008; Shikata et al., 2001). These reports highlight the segregated neural populations that are engaged for processing different depth cues. Despite the cortical segregation in the processing of different depth cues, we found that surface representations extracted from each depth cue likely contribute to the same representation in the ventral visual cortex. 
Conclusion
Our results suggest that neural correlates of object representation can exhibit depth-cue invariance. We have previously argued (Dehmoobadsharifabadi & Farivar, 2016; Farivar, 2009; Farivar et al., 2009) that depth-cue invariance is a fundamental property of object representations, alongside size, position, and viewpoint invariance. The M170 cross-cue adaptation effect, observed in a highly controlled paradigm using novel stimuli rendered with single depth cues, lends support to the depth-cue invariance of object representations in parts of the ventral visual cortex. 
Acknowledgments
The study was supported by start-up funds from the Research Institute of the McGill University Health Centre to RF and a Natural Sciences and Engineering Research Council Discovery Grant to RF (RGPIN 419235-2013). 
Commercial relationships: none. 
Corresponding author: Reza Farivar. 
Address: McGill Vision Research, Department of Ophthalmology, McGill University, Montreal, Canada. 
References
Akhavein, H., & Farivar, R. (2017). Gaze behavior during 3-D face identification is depth cue invariant. Journal of Vision, 17 (2): 9, 1–12, https://doi.org/10.1167/17.2.9.
Ban, H., Preston, T. J., Meeson, A., & Welchman, A. E. (2012). The integration of motion and disparity cues to depth in dorsal visual cortex. Nature Neuroscience, 15 (4), 636–643, https://doi.org/10.1038/nn.3046.
Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8 (6), 551–565, https://doi.org/10.1162/jocn.1996.8.6.551.
Brown, M. W., & Xiang, J. Z. (1998). Recognition memory: Neuronal substrates of the judgement of prior occurrence. Progress in Neurobiology, 55 (2), 149–189.
Caharel, S., d'Arripe, O., Ramon, M., Jacques, C., & Rossion, B. (2009). Early adaptation to repeated unfamiliar faces across viewpoint changes in the right hemisphere: Evidence from the N170 ERP component. Neuropsychologia, 47 (3), 639–643, https://doi.org/10.1016/j.neuropsychologia.2008.11.016.
Dehmoobadsharifabadi, A., & Farivar, R. (2016). Are face representations depth cue invariant? Journal of Vision, 16 (8): 6, 1–15, https://doi.org/10.1167/16.8.6.
Desimone, R. (1996). Neural mechanisms for visual memory and their role in attention. Proceedings of the National Academy of Sciences, USA, 93 (24), 13494–13499.
Dovencioglu, D., Ban, H., Schofield, A. J., & Welchman, A. E. (2013). Perceptual integration for qualitatively different 3-D cues in the human brain. Journal of Cognitive Neuroscience, 25 (9), 1527–1541, https://doi.org/10.1162/jocn_a_00417.
Farivar, R. (2009). Dorsal-ventral integration in object recognition. Brain Research Reviews, 61 (2), 144–153, https://doi.org/10.1016/j.brainresrev.2009.05.006.
Farivar, R., Blanke, O., & Chaudhuri, A. (2009). Dorsal-ventral integration in the recognition of motion-defined unfamiliar faces. The Journal of Neuroscience, 29 (16), 5336–5342, https://doi.org/10.1523/jneurosci.4978-08.2009.
George, N., Dolan, R. J., Fink, G. R., Baylis, G. C., Russell, C., & Driver, J. (1999). Contrast polarity and face recognition in the human fusiform gyrus. Nature Neuroscience, 2 (6), 574–580, https://doi.org/10.1038/9230.
Georgieva, S. S., Todd, J. T., Peeters, R., & Orban, G. A. (2008). The extraction of 3D shape from texture and shading in the human brain. Cerebral Cortex, 18 (10), 2416–2438, https://doi.org/10.1093/cercor/bhn002.
Grill-Spector, K., Kushnir, T., Edelman, S., Itzchak, Y., & Malach, R. (1998). Cue-invariant activation in object-related areas of the human occipital lobe. Neuron, 21 (1), 191–202.
Grunewald, A., Bradley, D. C., & Andersen, R. A. (2002). Neural correlates of structure-from-motion perception in macaque V1 and MT. The Journal of Neuroscience, 22 (14), 6195–6207, https://doi.org/10.1523/JNEUROSCI.22-14-06195.2002.
Harris, A., & Nakayama, K. (2007). Rapid face-selective adaptation of an early extrastriate component in MEG. Cerebral Cortex, 17 (1), 63–70, https://doi.org/10.1093/cercor/bhj124.
Harris, A., & Nakayama, K. (2008). Rapid adaptation of the m170 response: Importance of face parts. Cerebral Cortex, 18 (2), 467–476, https://doi.org/10.1093/cercor/bhm078.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4 (6), 223–233.
Heeger, D. J., Boynton, G. M., Demb, J. B., Seidemann, E., & Newsome, W. T. (1999). Motion opponency in visual cortex. The Journal of Neuroscience, 19 (16), 7162–7174.
Heisz, J. J., Watter, S., & Shedden, J. M. (2006). Automatic face identity encoding at the N170. Vision Research, 46 (28), 4604–4614, https://doi.org/10.1016/j.visres.2006.09.026.
Henson, R. N. (2003). Neuroimaging studies of priming. Progress in Neurobiology, 70 (1), 53–81.
Henson, R. N., Goshen-Gottstein, Y., Ganel, T., Otten, L. J., Quayle, A., & Rugg, M. D. (2003). Electrophysiological and haemodynamic correlates of face perception, recognition and priming. Cerebral Cortex, 13 (7), 793–805.
Hoffman, E. A., & Haxby, J. V. (2000). Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nature Neuroscience, 3 (1), 80–84, https://doi.org/10.1038/71152.
Holmes, C. J., Hoge, R., Collins, L., Woods, R., Toga, A. W., & Evans, A. C. (1998). Enhancement of MR images using registration for signal averaging. Journal of Computer Assisted Tomography, 22 (2), 324–333.
Horovitz, S. G., Rossion, B., Skudlarski, P., & Gore, J. C. (2004). Parametric design and correlational analyses help integrating fMRI and electrophysiological data during face processing. NeuroImage, 22 (4), 1587–1595, https://doi.org/10.1016/j.neuroimage.2004.04.018.
Huang, M. X., Mosher, J. C., & Leahy, R. M. (1999). A sensor-weighted overlapping-sphere head model and exhaustive head model comparison for MEG. Physics in Medicine & Biology, 44 (2), 423–440.
Huberle, E., & Lutzenberger, W. (2013). Temporal properties of shape processing by event-related MEG adaptation. NeuroImage, 67, 119–126, https://doi.org/10.1016/j.neuroimage.2012.10.070.
Jeffreys, D. A. (1996). Evoked potential studies of face and object processing. Visual Cognition, 3 (1), 1–38, https://doi.org/10.1080/713756729.
Johnston, E. B., Cumming, B. G., & Parker, A. J. (1993). Integration of depth modules: Stereopsis and texture. Vision Research, 33 (5–6), 813–826.
Kamitani, Y., & Tong, F. (2006). Decoding seen and attended motion directions from activity in the human visual cortex. Current Biology, 16 (11), 1096–1102, https://doi.org/10.1016/j.cub.2006.04.003.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. The Journal of Neuroscience, 17 (11), 4302–4311.
Kietzmann, T. C., Ehinger, B. V., Porada, D., Engel, A. K., & Konig, P. (2016). Extensive training leads to temporal and spatial shifts of cortical activity underlying visual category selectivity. NeuroImage, 134, 22–34, https://doi.org/10.1016/j.neuroimage.2016.03.066.
Konen, C. S., & Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11 (2), 224–231, https://doi.org/10.1038/nn2036.
Kovacs, G., Zimmer, M., Banko, E., Harza, I., Antal, A., & Vidnyanszky, Z. (2006). Electrophysiological correlates of visual adaptation to faces and body parts in humans. Cerebral Cortex, 16 (5), 742–753, https://doi.org/10.1093/cercor/bhj020.
Liu, J., Higuchi, M., Marantz, A., & Kanwisher, N. (2000). The selectivity of the occipitotemporal M170 for faces. NeuroReport, 11 (2), 337–341.
Liu, Y., Vogels, R., & Orban, G. A. (2004). Convergence of depth from texture and depth from disparity in macaque inferior temporal cortex. The Journal of Neuroscience, 24 (15), 3795–3800, https://doi.org/10.1523/jneurosci.0150-04.2004.
McCarthy, G., Puce, A., Gore, J. C., & Allison, T. (1997). Face-specific processing in the human fusiform gyrus. Journal of Cognitive Neuroscience, 9 (5), 605–610, https://doi.org/10.1162/jocn.1997.9.5.605.
Merigan, W. H. (2000). Cortical area V4 is critical for certain texture discriminations, but this effect is not dependent on attention. Visual Neuroscience, 17 (6), 949–958.
Merigan, W. H., & Pham, H. A. (1998). V4 lesions in macaques affect both single- and multiple-viewpoint shape discriminations. Visual Neuroscience, 15 (2), 359–367.
Murphy, A. P., Ban, H., & Welchman, A. E. (2013). Integration of texture and disparity cues to surface slant in dorsal visual cortex. Journal of Neurophysiology, 110 (1), 190–203, https://doi.org/10.1152/jn.01055.2012.
Paradis, A. L., Cornilleau-Peres, V., Droulez, J., Van De Moortele, P. F., Lobel, E., Berthoz, A.,… Poline, J. B. (2000). Visual perception of motion and 3-D structure from motion: An fMRI study. Cerebral Cortex, 10 (8), 772–783.
Puce, A., Allison, T., Asgari, M., Gore, J. C., & McCarthy, G. (1996). Differential sensitivity of human visual cortex to faces, letterstrings, and textures: A functional magnetic resonance imaging study. The Journal of Neuroscience, 16 (16), 5205–5215.
Puce, A., Allison, T., Bentin, S., Gore, J. C., & McCarthy, G. (1998). Temporal cortex activation in humans viewing eye and mouth movements. The Journal of Neuroscience, 18 (6), 2188–2199.
Retter, T. L., & Rossion, B. (2016). Visual adaptation provides objective electrophysiological evidence of facial identity discrimination. Cortex, 80, 35–50, https://doi.org/10.1016/j.cortex.2015.11.025.
Rosenholtz, R., & Malik, J. (1997). Surface orientation from texture: Isotropy or homogeneity (or both)? Vision Research, 37 (16), 2283–2293.
Sadeh, B., Podlipsky, I., Zhdanov, A., & Yovel, G. (2010). Event-related potential and functional MRI measures of face-selectivity are highly correlated: A simultaneous ERP-fMRI investigation. Human Brain Mapping, 31 (10), 1490–1501, https://doi.org/10.1002/hbm.20952.
Sams, M., Hietanen, J. K., Hari, R., Ilmoniemi, R. J., & Lounasmaa, O. V. (1997). Face-specific responses from the human inferior occipito-temporal cortex. Neuroscience, 77 (1), 49–55.
Sary, G., Vogels, R., & Orban, G. A. (1993, May 14). Cue-invariant shape selectivity of macaque inferior temporal neurons. Science, 260 (5110), 995–997.
Scholl, C. A., Jiang, X., Martin, J. G., & Riesenhuber, M. (2014). Time course of shape and category selectivity revealed by EEG rapid adaptation. Journal of Cognitive Neuroscience, 26 (2), 408–421, https://doi.org/10.1162/jocn_a_00477.
Sergent, J., Ohta, S., & MacDonald, B. (1992). Functional neuroanatomy of face and object processing: A positron emission tomography study. Brain, 115 (1), 15–36.
Shikata, E., Hamzei, F., Glauche, V., Knab, R., Dettmers, C., Weiller, C., & Büchel, C. (2001). Surface orientation discrimination activates caudal and anterior intraparietal sulcus in humans: An event-related fMRI study. Journal of Neurophysiology, 85 (3), 1309–1314.
Simpson, M. I., Johnson, S. R., Prendergast, G., Kokkinakis, A. V., Johnson, E., Green, G. G., & Johnston, P. J. (2015). MEG adaptation resolves the spatiotemporal characteristics of face-sensitive brain responses. The Journal of Neuroscience, 35 (45), 15088–15096, https://doi.org/10.1523/jneurosci.2090-15.2015.
Tadel, F., Baillet, S., Mosher, J. C., Pantazis, D., & Leahy, R. M. (2011). Brainstorm: A user-friendly application for MEG/EEG analysis. Computational Intelligence and Neuroscience, 2011, 879716, https://doi.org/10.1155/2011/879716.
Tootell, R. B., Reppas, J. B., Kwong, K. K., Malach, R., Born, R. T., Brady, T. J.,… Belliveau, J. W. (1995). Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging. The Journal of Neuroscience, 15 (4), 3215–3230.
Tsutsui, K., Jiang, M., Yara, K., Sakata, H., & Taira, M. (2001). Integration of perspective and disparity cues in surface-orientation-selective neurons of area CIP. Journal of Neurophysiology, 86 (6), 2856–2867.
Vizioli, L., Rousselet, G. A., & Caldara, R. (2010). Neural repetition suppression to identity is abolished by other-race faces. Proceedings of the National Academy of Sciences, USA, 107 (46), 20081–20086, https://doi.org/10.1073/pnas.1005751107.
Welchman, A. E., Deubelius, A., Conrad, V., Bulthoff, H. H., & Kourtzi, Z. (2005). 3D shape perception from combined depth cues in human visual cortex. Nature Neuroscience, 8 (6), 820–827, https://doi.org/10.1038/nn1461.
Zeki, S., Watson, J. D., Lueck, C. J., Friston, K. J., Kennard, C., & Frackowiak, R. S. (1991). A direct demonstration of functional specialization in human visual cortex. The Journal of Neuroscience, 11 (3), 641–649.
Footnotes
1  We had also included stimuli defined by binocular disparity, viewed through a pair of polarized goggles, but a number of subjects reported after the recording session that they were unable to perceive the disparity-defined S1 stimuli on most trials, a problem attributed to the goggles fogging up. This was not evident in our pilot testing but compelled us to exclude the disparity data from further analysis.