Research Article  |   January 2007
Electrophysiological correlates of perceptual reversals for three different types of multistable images
Journal of Vision January 2007, Vol.7, 6. doi:https://doi.org/10.1167/7.1.6
      Michael A. Pitts, Janice L. Nerger, Trevor J. R. Davis; Electrophysiological correlates of perceptual reversals for three different types of multistable images. Journal of Vision 2007;7(1):6. https://doi.org/10.1167/7.1.6.

Abstract

Electrophysiological recordings were made in 21 observers to investigate whether differences in signature components (P1, N1, selection negativity [SN]) would be revealed during perceptual reversals of three different multistable figures. Using a lattice of Necker cubes as a stimulus, J. Kornmeier and M. Bach (2004, 2005) reported differences in P1 amplitudes as well as a broad reversal-related negativity occurring 200–400 ms poststimulus. The current study investigated whether these event-related potentials of Necker cube reversals represent general “perceptual switching” mechanisms and would, therefore, be common to other types of multistable figures. Three different types of multistable stimuli were utilized: a modified Rubin's face/vase, a modified Schröder's staircase, and a novel natural stimulus, Lemmo's cheetahs. Results revealed the broad reversal-related negativity for the face/vase and the reversible staircase but not for the cheetahs. This component is comparable to the SN in polarity, latency, and scalp topography. An effect of early visual spatial attention on figure reversals was suggested by an analysis of the occipital P1 and N1 components. The P1, N1, or both were enhanced for trials in which the observer reported perceptual reversals compared with trials in which no reversals were reported for the face/vase and reversible staircase stimuli. These results support a model of multistable perception in which changes in early spatial attention (indicated by P1 and N1 enhancement) modulate perceptual reversals (indicated by the reversal negativity or SN).

Introduction
Multistable visual stimuli, that is, visual images that can be perceived in at least two mutually exclusive ways, offer unique tools for dissociating perceptual from stimulus-driven changes in visual processing. In multistable perception, physical input to the retina remains constant, whereas perceptual interpretations of the ambiguous input alternate or “reverse” between the perceptual possibilities. Although psychologists have been studying multistable perception for well over a century (see Long & Toppino, 2004, for a review), no consensus concerning the underlying mechanisms that influence perceptual reversals has yet been reached. One recent debate has focused on “low-level” versus “high-level” influences (sometimes referred to as “bottom–up” and “top–down” influences). Low-level explanations suggest that reversals may be due to adaptation of feedforward mechanisms. In this model, activity of one mechanism supports one of the two possible percepts, and when fatigued, it gives rise to the opposing (or competing) percept supported by a different mechanism (e.g., Cohen, 1959; Kohler, 1940; Orbach, Ehrlich, & Heath, 1963; Toppino & Long, 1987). Alternatively, high-level explanations suggest that reversals are caused by mechanisms acting in a feedback fashion on lower level sensory mechanisms (e.g., Georgiades & Harris, 1997; Horlitz & O'Leary, 1993; Kawabata, 1986; Leopold, 2003; Meng & Tong, 2004; Pelton & Solley, 1968; Rock, Hall, & Davis, 1994; Shulman, 1993; Struber & Stadler, 1999; Toppino, 2003). Attention-based accounts, for example, propose that high-level cognitive networks shift spatial attention, which then affects the perceptual focus on incoming sensory information, leading to perceptual reversals (Leopold & Logothetis, 1999; Slotnick & Yantis, 2005).
Over the past decade, converging evidence from neurophysiological and neuropsychological studies has begun to elucidate the nature of the underlying physiology of perceptual reversals. For example, recent fMRI studies have shown that increases in neural activity in frontoparietal regions occur for both perceptual reversals of multistable stimuli and for voluntary shifts in spatial attention (Inui et al., 2000; Kleinschmidt, Buchel, Zeki, & Frackowiak, 1998; Slotnick & Yantis, 2005). Similarly, Windmann, Wehrmann, Calabrese, and Gunturkun (2006) reported that patients with frontal lobe damage were unable to intentionally increase the reversal rates of various multistable stimuli, whereas control participants were able to successfully control reversal rates. Earlier neuropsychological studies reached the same conclusion (Ricci & Blundo, 1990) and have even postulated a right frontal lobe lateralization of perceptual switching mechanisms (Meenan & Miller, 1994). Based on neurophysiological evidence, Leopold and Logothetis (1999) developed an environment-exploration theory of perceptual reversals. In this theory, visual attention is continuously redirected to “refresh” perceptual organization and to ensure accurate interpretation of incoming sensory information. 
One of the major challenges in determining the underlying neurophysiological mechanisms responsible for perceptual reversals is the speed at which visual processing occurs. Foxe and Simpson (2002), for example, determined that visual information reaches the primary visual cortex (V1) in less than 56 ms, frontal regions in 80 ms, and feedback circuits that can influence early visual processing in under 100 ms. Currently, fMRI studies can measure, at best, changes in neural activity over a 1,000- to 3,000-ms period (Luck, 2005) and are, thus, too sluggish to measure the postulated mechanisms. Experiments utilizing single-cell recordings, field potential recordings, or both are more temporally and spatially accurate than fMRI but are limited to nonhuman primates. Using these techniques, researchers have identified specific cells in the extrastriate layers of the cortex that change their firing patterns just before a perceptual reversal occurs, as well as other cells that exhibit firing patterns correlated with each of the two possible percepts (Leopold & Logothetis, 1996, 1999; Logothetis & Schall, 1989; Sheinberg & Logothetis, 1997). Event-related potentials (ERPs), however, offer a unique opportunity to study electrophysiological changes related to perceptual reversals on a time scale comparable to that of single-cell recordings and have the advantage that they can be easily recorded in awake human subjects.
Early ERP studies of multistable perception identified a P300-like component related to perceptual reversals (Basar-Eroglu, Struber, Stadler, Kruse, & Basar, 1993; Isoglu-Alkac et al., 1998). The P300 component is known to represent higher level cognitive processes (Luck, 2005; Picton, 1992) and does not seem to be related to early or intermediate-level perceptual processing. In these experiments, static multistable stimuli were presented, and participants were asked to press a button when a perceptual reversal occurred. The EEG recordings were then time-locked to the participant's response, and average ERP waves were derived. A weakness with this method is the variability inherent in the data-averaging technique. Because participants' reaction times to exogenous (and presumably endogenous) changes vary from trial to trial, recordings that are time-locked to the response require looking backward in time for the perceptual event and will most likely obscure any smaller ERP components in the averaging process. Kornmeier and Bach (2004) found that presenting temporally discontinuous (i.e., flashed) stimuli and time locking to stimulus onset produce much sharper and more clearly defined ERP components. In addition, because observers in the study of Kornmeier and Bach reported that perceptual reversals only occurred at stimulus onset, criticisms of the backward averaging technique were mitigated, and the use of the stimulus presentation paradigm was further validated. A similar attempt at utilizing a discontinuous technique had previously been made with multistable stimuli (O'Donnell, Hendler, & Squires, 1988), but interstimulus intervals (ISIs) of 3.3 s proved too long for maintenance of steady reversal rates (also see Leopold, Wilke, Maier, & Logothetis, 2002, for a discussion of long ISI effects). 
By formulating this new paradigm in which perceptual reversals are entrained to stimulus onset, Kornmeier and Bach (2004, 2005) were able to identify two early ERP components related to endogenous perceptual reversals of the Necker cube. They analyzed and described these components by computing difference waves from reversal trials minus stability trials. The largest difference between the two waveforms began at 160 ms poststimulus, peaked at 250 ms, and persisted until about 400 ms. This broad, negative, reversal-related difference was termed the “reversal negativity” (Kornmeier & Bach, 2004). An earlier reversal component was identified in a subsequent study, the “reversal positivity,” and peaked at 120 ms poststimulus (Kornmeier & Bach, 2005). Kornmeier and Bach argued that the ERP traces for perceptual reversals support a low-level (or bottom–up) theory of multistable perception because the differences occurred so early in the waveform. 
The only multistable stimulus used so far under this new discontinuous presentation paradigm has been the Necker cube lattice. Therefore, the primary purpose of the current study was to determine whether these ERP differences are generalizable to different types of multistable perceptual reversals or are particular to Necker cube reversals. Three stimuli were chosen to fulfill this purpose: (1) Schröder's staircase, which elicits depth-orientation reversals similar to those of the Necker cube; (2) Rubin's face/vase, which elicits figure-ground reversals; and (3) Lemmo's cheetahs, a novel multistable stimulus, which involves figure belongingness reversals of a natural image (used with permission of photographer Gerry Lemmo, 2006).
The second purpose of the current investigation was to analyze the reversal-related ERPs in such a way that allows comparisons to existing visual ERP research, particularly to studies involving spatial and selective attention (Hillyard & Anllo-Vento, 1998; Hillyard, Vogel, & Luck, 1998; Luck et al., 1994; Mangun, 1995). The reversal positivity identified by Kornmeier and Bach (2005) occurs at the same latency as the visual P1 peak, whereas the reversal negativity, which was likewise identified by them (Kornmeier & Bach, 2004), occurs at the same latencies associated with the selection negativity (SN). Comparisons between these components were made to help further elucidate the neurophysiological mechanisms that underlie multistable perception. 
Methods
Participants
A total of 21 observers (8 male, 13 female; mean age = 24 years, age range = 19–49 years) participated in this study as paid volunteers. Eye dominance was determined via simple dichoptic tests, and visual acuity was assessed via a high-contrast Bailey–Lovie acuity chart; only observers with uncorrected or corrected foveal acuities of 20/40 or better participated in this experiment (one participant was excluded based on this criterion). All procedures adhered to federal regulations and were approved by the Colorado State University Institutional Review Board; written informed consent was obtained from each observer prior to participation in the experiment.
Stimuli
Three multistable stimuli (see Figure 1) were employed in this study: a modified Rubin's face/vase, a modified Schröder's staircase, and Lemmo's ambiguous cheetahs (a photograph of two cheetahs used with the permission of photographer Gerry Lemmo). All stimuli subtended a visual angle of 3.3° × 3.3° and were presented on a Dell Monitor (Plug & Play Monitor on RADEON 7000, Microsoft Inc.; 85 Hz frame rate). To avoid any visual persistence or effects of afterimages, we repositioned the stimuli in space by 0.8° in both horizontal and vertical directions between presentations, resulting in five different spatial variants. Observers maintained fixation on a small (0.2°), centrally placed fixation cross that was visible throughout all stimulus presentations and ISIs.
Figure 1a, 1b, 1c
 
The three multistable stimuli used in this study. (A) A modified Rubin's face/vase (edge version) elicits figure-ground reversals with minimal depth cues; that is, the figure can be seen as a centrally placed vase or as two profiles facing one another. (B) A modified Schröder's staircase (45° tilted) elicits depth-perspective reversals; that is, the stairs can be seen in two distinct three-dimensional configurations. (C) Lemmo's ambiguous cheetahs elicit object-belongingness reversals; that is, the cheetah in the front can appear to be looking to the right while the cheetah in the back is looking to the left or, alternatively, the cheetah in the front can appear to be looking to the left while the cheetah in the back is looking to the right.
EEG recording
EEG scalp voltages were recorded using a Geodesic EEG System, NetAmps 200 (Electrical Geodesics Inc. [EGI], Eugene, OR). A 128-channel Hydrocel Geodesic Sensor Net (EGI) held each electrode in place. Each carbon-fiber electrode consists of a silver-chloride carbon fiber pellet, a lead wire, a gold-plated pin, and a potassium-chloride-soaked sponge. This electrode configuration effectively blocks out electrochemical noise and minimizes triboelectric noise. Signals were amplified via an AC-coupled, 128-channel high-input impedance amplifier (NetAmps 200, EGI). Amplified analog voltages, hardware band-pass-filtered at 0.1–100 Hz, were digitized at a 500-Hz sampling rate. All sensors were individually adjusted by the experimenter until the impedance of each was less than 40 kΩ. 
Procedure
Observers were comfortably seated 1.4 m from the computer monitor to maintain an approximately constant retinal image size. Prior to any recordings, observers viewed static versions of each of the three multistable stimuli. If an observer was initially unable to perceive both interpretations of any of the stimuli, the experimenter helped guide the observer by tracing the outline of the alternative percept on the computer monitor. 
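As a rough check on the display geometry, the visual angles given in the Stimuli section can be converted to physical sizes at the 1.4-m viewing distance. The sketch below is illustrative only: the function name is hypothetical, and it assumes a flat screen viewed head-on with the stimulus centered on the line of sight.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_m: float) -> float:
    """Extent in metres that subtends angle_deg, centred on the line of sight."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_M = 1.4  # viewing distance reported in the Procedure section

# Angles reported for the stimuli (3.3 deg) and the fixation cross (0.2 deg).
stimulus_cm = 100.0 * size_from_visual_angle(3.3, VIEWING_DISTANCE_M)
fixation_cm = 100.0 * size_from_visual_angle(0.2, VIEWING_DISTANCE_M)

print(f"stimulus side:  {stimulus_cm:.1f} cm")
print(f"fixation cross: {fixation_cm:.2f} cm")
```

Under these assumptions, each stimulus would measure roughly 8 cm on a side on the screen, and the fixation cross about half a centimetre.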
Because reversal rates are known to increase during initial exposure to a novel multistable stimulus (Long & Toppino, 2004), practice trials were administered. During impedance measurements, two blocks of 30 practice trials for each of the three multistable stimuli were administered. The practice trials also served to familiarize the observers with the timing of stimulus presentation, the importance of fixating on the fixation cross, and the operation of the response box. 
Stimuli were flashed on the screen for 800 ms followed by a 400-ms ISI during which the participant would either (1) press a response button that indicates that their perception of the stimulus had reversed compared with the previous trial (reversal trials) or (2) wait for the next stimulus to appear without responding in the case of a nonreversal of the image (stability trials). Adopting the protocol used by Kornmeier and Bach (2004), we extended the ISI to 1,000 ms following trials that elicited a perceptual reversal. Figure 2 depicts the stimulus presentation protocol, and Movie 1 shows the stimulus and ISI durations as seen by observers. Because the occurrence of perceptual reversals of multistable figures has been shown to be influenced by intentional control (Kawabata, 1986; Leopold & Logothetis, 1999; Liebert & Burk, 1985; Long & Toppino, 2004; Pelton & Solley, 1968; Struber & Stadler, 1999; Toppino, 2003; van Ee, 2005; van Ee, van Dam, & Brouwer, 2005), observers were instructed not to voluntarily induce reversals but, rather, to simply view the stimuli and permit the reversals to occur naturally. To obviate any effects of eye movements (Georgiades & Harris, 1997; Long & Toppino, 2004), participants maintained their gaze on a centrally located fixation cross. All stimuli were viewed monocularly with the dominant eye to eliminate binocular depth cues that can occasionally lead to “flatter” appearances of these two-dimensional stimuli. 
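The reported durations can be checked against the 85-Hz frame rate given in the Stimuli section. The sketch below is only an arithmetic check with a hypothetical helper name; the source does not describe how presentation timing was implemented in software.

```python
REFRESH_HZ = 85  # frame rate reported for the monitor


def frames_for(duration_ms: int, refresh_hz: int = REFRESH_HZ) -> tuple[int, int]:
    """Return (whole frames, remainder); a remainder of 0 means the
    duration is an exact multiple of the frame period."""
    return divmod(duration_ms * refresh_hz, 1000)


# Durations reported in the text.
durations_ms = {"stimulus": 800, "ISI": 400, "ISI after reversal": 1000}
frame_counts = {name: frames_for(ms) for name, ms in durations_ms.items()}

for name, (frames, remainder) in frame_counts.items():
    print(f"{name}: {frames} frames, remainder {remainder}")
```

All three durations correspond to whole numbers of frames at 85 Hz (68, 34, and 85 frames), so the nominal timing is achievable without rounding to the nearest refresh.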
Figure 2
 
To allow time locking of the EEG recording to the stimulus rather than to the response and to entrain the moment of perceptual reversals to an externally observable event, that is, stimulus onset, we continuously flashed stimuli on and off in 800-ms stimulus/400-ms ISI presentations (adopted from Kornmeier & Bach, 2004). Observers indicated reversals by a button press, which extended the ISI to 1,000 ms before the next 800–400 ms cycle resumed.
 
Movie 1
 
One of the three multistable stimuli, Schröder's staircase, is shown in alternation with the ISI for the same durations (800/400 ms) as in the experiment. For all observers in the study, reversals only occurred at stimulus onset.
As in the Kornmeier and Bach studies (2004, 2005), the nature of the participant's task inevitably introduced a motor component to reversal trials and not to stability trials. This response-based difference should not be of concern for the current investigation because motor-related potentials occur later and are recorded from frontal and central scalp locations, whereas the components of interest in this study occur earlier and were recorded at posterior electrode sites. Furthermore, Kornmeier and Bach (2004) tested for this potential confound by changing the button-press task in half of the trials (respond to stability vs. respond to reversals) and found no differences in the response versus nonresponse ERP waveforms. An additional methodological consideration involves the possibility of trials in which neither of the two primary percepts was experienced; instead, a third, two-dimensional “flat” percept was seen. Although no participant reported seeing the images as such, if they did occur, it is unknown whether these “aberrant” trials would be categorized by the participants as reversal or stability trials. This would depend on whether the participant was responding to “general perceptual change” or “change to a specific percept.” Thus, to the extent that flat percepts occurred, they are most likely averaged into the mean variability in both types of trials. 
Each stimulus was presented 150 times per block of experimental trials, resulting in blocks lasting approximately 4 min. Short trial blocks with breaks after each helped to alleviate observer fatigue. An extended break was provided halfway through the experiment in which experimenters measured electrode impedance and rewet the sponges if necessary. Three blocks were run for each stimulus. EEG was recorded throughout the nine experimental blocks, which were counterbalanced across stimulus conditions. Each experimental session lasted approximately 40 min. 
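The approximate block duration follows from the trial timing. The sketch below estimates it from the 800-ms stimulus, the 400-ms ISI, and the 1,000-ms ISI after a reported reversal; the function name is hypothetical, the reversal rates are those given in the Behavioral results, and breaks between blocks are ignored.

```python
def expected_block_seconds(
    reversal_rate: float,
    n_trials: int = 150,
    stim_s: float = 0.8,
    isi_s: float = 0.4,
    isi_after_reversal_s: float = 1.0,
) -> float:
    """Expected block length when a fraction reversal_rate of trials
    is followed by the extended ISI instead of the regular one."""
    per_trial = (
        stim_s
        + (1.0 - reversal_rate) * isi_s
        + reversal_rate * isi_after_reversal_s
    )
    return n_trials * per_trial


# Reversal rates reported for the three stimuli.
for name, rate in [("staircase", 0.39), ("face/vase", 0.33), ("cheetahs", 0.29)]:
    minutes = expected_block_seconds(rate) / 60.0
    print(f"{name}: {minutes:.1f} min per block")
```

With these inputs, a block runs for roughly three and a half minutes of stimulation, which is in line with the approximately 4 min stated above.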
ERP analyses
ERPs were time-locked to stimulus onset, baseline corrected at −100 to 0 ms, and low-pass-filtered at 25 Hz (following procedures of Kornmeier & Bach, 2004). Trials were discarded from analysis if they contained an eye blink or eye movement (EOG > 70 μV) or if more than 20% of electrode channels exceeded defined signal amplitudes (average amplitude, >200 μV, or transit amplitude, >100 μV). On average, 9% of trials per individual were rejected due to a combination of these factors. In addition, a participant's data were included in further analyses only if at least 25 nondiscarded trials per condition remained. Average-referenced ERPs were computed for each channel by calculating the differences between each channel and a spherical interpolation of the average of all 128 channels.
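The main steps of this pipeline can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the processing code used in the study: all function names are hypothetical, the EOG criterion is read as an absolute threshold, the channel-amplitude criteria and the 25-Hz filter are omitted, and the average reference is a plain mean across channels rather than the spherical interpolation described above.

```python
from statistics import mean

SAMPLE_RATE_HZ = 500  # reported sampling rate (2 ms per sample)
BASELINE_MS = 100     # epoch is assumed to start 100 ms before stimulus onset


def baseline_correct(epoch):
    """Subtract the mean of the -100 to 0 ms interval from every sample."""
    n_base = BASELINE_MS * SAMPLE_RATE_HZ // 1000  # 50 samples
    base = mean(epoch[:n_base])
    return [v - base for v in epoch]


def reject_epoch(eog_uv, eog_limit_uv=70.0):
    """True if any EOG sample exceeds the limit in absolute value
    (an assumed reading of the 'EOG > 70 uV' criterion)."""
    return any(abs(v) > eog_limit_uv for v in eog_uv)


def average_reference(sample_by_channel):
    """Re-reference one time sample to the plain mean over channels."""
    m = mean(sample_by_channel)
    return [v - m for v in sample_by_channel]


def average_erp(epochs, min_trials=25):
    """Average retained epochs sample by sample; None if too few remain."""
    if len(epochs) < min_trials:
        return None
    return [mean(column) for column in zip(*epochs)]


# Toy epoch: 50 baseline samples at 2 uV, then 10 samples at 5 uV.
corrected = baseline_correct([2.0] * 50 + [5.0] * 10)
print(corrected[0], corrected[-1])
```

The `min_trials` default mirrors the requirement of at least 25 nondiscarded trials per condition.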
Recordings were sorted by condition and averaged for each individual observer, resulting in six ERP traces, that is, one reversal and one stability waveform (defined by button press vs. no button press) for each of the three stimulus types. Based on the findings of Kornmeier and Bach (2004, 2005), five posterior electrode sites were chosen for analysis: 75, 70, 83, 65, and 90 (equivalent to Oz, O1, O2, PO7, and PO8 respectively; see Luu & Ferree, 2005). Additionally, to investigate a possible role of the frontal attention network in figure reversal and a possible frontal counterpart to the posterior reversal negativity, we chose three anterior electrode sites for analysis: 11, 19, and 4 (Fz, F1, and F2 respectively). For statistical analyses, amplitudes were averaged within the two clusters of electrode sites, that is, the occipital group and the frontal group. 
To compare the current study's results to the findings of Kornmeier and Bach (2004, 2005), we computed difference waves (reversal minus stability) for each of the three stimulus types. The reversal negativity has been described as a broad negative difference that occurs between the occipital N1 (≈180 ms) and the P3 (≈400 ms). Kornmeier and Bach defined the reversal negativity by computing difference waves and identifying the largest excursion within this broad time window. For statistical purposes, defining a long-duration component such as this by analyzing the amplitude at the point of maximal difference may contribute to an artificial inflation of differences. A more conservative and arguably more accurate method for analyzing this component is to calculate the mean amplitude across the entire time window in each condition (for difference waves, this is essentially an area-under-the-curve measure). This latter approach was applied in the current study. 
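The contrast between the two measures can be made concrete with a toy example. The sketch below builds an artificial difference wave and compares its mean amplitude over a 200–400 ms window (the window used in the Results) with its most negative point. The waveform values and helper names are invented for illustration, and the epoch is assumed to start at −100 ms with 500-Hz sampling.

```python
from statistics import mean

SAMPLE_RATE_HZ = 500   # reported sampling rate
EPOCH_START_MS = -100  # assumed epoch start (beginning of the baseline)


def ms_to_index(t_ms: int) -> int:
    """Sample index of a poststimulus latency within the epoch."""
    return (t_ms - EPOCH_START_MS) * SAMPLE_RATE_HZ // 1000


def difference_wave(reversal, stability):
    """Reversal minus stability, sample by sample."""
    return [r - s for r, s in zip(reversal, stability)]


def mean_amplitude(wave, start_ms, end_ms):
    """Mean over the window, inclusive of both end points."""
    return mean(wave[ms_to_index(start_ms): ms_to_index(end_ms) + 1])


def peak_negativity(wave, start_ms, end_ms):
    """Most negative value within the window."""
    return min(wave[ms_to_index(start_ms): ms_to_index(end_ms) + 1])


# Toy waveforms: a flat -1 uV difference from 200 to 400 ms,
# plus a single -3 uV excursion at 300 ms.
n_samples = ms_to_index(500) + 1
stability = [0.0] * n_samples
reversal = [0.0] * n_samples
for i in range(ms_to_index(200), ms_to_index(400) + 1):
    reversal[i] = -1.0
reversal[ms_to_index(300)] = -3.0

diff = difference_wave(reversal, stability)
rn_mean = mean_amplitude(diff, 200, 400)
rn_peak = peak_negativity(diff, 200, 400)
print(f"mean amplitude: {rn_mean:.2f} uV, peak: {rn_peak:.2f} uV")
```

In this toy case a single brief excursion triples the peak measure while barely moving the window mean, which is the sense in which the mean amplitude is the more conservative statistic.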
The variability of the mean amplitude of the reversal negativity was estimated by a 2 × 3 repeated measures ANOVA with the factors perception (reverse or stable) and stimulus type (stairs, face/vase, or cheetahs). Three a priori t tests (one for each stimulus type; corrected for multiple comparisons) were then performed to analyze whether the mean amplitude for reversals was significantly more negative than the mean amplitude for stability. As no prior hypothesis predicted the accompanying frontal positivity, post hoc tests were employed to assess this difference. To investigate early attention effects on perceptual reversals, the peak amplitudes of the occipital P1, occipital N1, and frontal N1 were measured. The occipital P1 and frontal N1 amplitudes were measured relative to baseline, which was defined as the average amplitude from −100 to 0 ms. Occipital N1 peak amplitude was measured relative to the amplitude of the preceding peak, the P1, to account for the temporal overlap between these two components. Peak amplitudes were then analyzed with a 2 × 3 repeated measures ANOVA with the factors perception (reverse or stable) and stimulus type (stairs, face/vase, or cheetahs). Post hoc comparisons were made to determine whether peak amplitudes of reversal trials differed significantly from peak amplitudes of stability trials. 
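The two peak measures described above (P1 relative to baseline, N1 relative to the preceding P1) can be sketched as follows. The search windows are hypothetical, chosen only to bracket the peak latencies reported in the Results; the source does not state the windows that were used, and the helper names and toy waveform are likewise invented.

```python
SAMPLE_RATE_HZ = 500       # reported sampling rate
EPOCH_START_MS = -100      # assumed epoch start
P1_WINDOW_MS = (80, 150)   # hypothetical search window for the P1
N1_WINDOW_MS = (150, 220)  # hypothetical search window for the N1


def ms_to_index(t_ms: int) -> int:
    """Sample index of a poststimulus latency within the epoch."""
    return (t_ms - EPOCH_START_MS) * SAMPLE_RATE_HZ // 1000


def window(wave, start_ms, end_ms):
    """Samples between two latencies, inclusive."""
    return wave[ms_to_index(start_ms): ms_to_index(end_ms) + 1]


def p1_amplitude(wave):
    """P1 peak relative to baseline (wave is assumed baseline corrected)."""
    return max(window(wave, *P1_WINDOW_MS))


def n1_amplitude_re_p1(wave):
    """N1 peak measured relative to the preceding P1 peak."""
    return min(window(wave, *N1_WINDOW_MS)) - p1_amplitude(wave)


# Toy baseline-corrected waveform: +4 uV at 116 ms, -2 uV at 176 ms.
wave = [0.0] * (ms_to_index(500) + 1)
wave[ms_to_index(116)] = 4.0
wave[ms_to_index(176)] = -2.0

p1 = p1_amplitude(wave)
n1 = n1_amplitude_re_p1(wave)
print(f"P1 = {p1} uV, N1 re P1 = {n1} uV")
```

Measuring the N1 from the P1 peak rather than from baseline means that a larger P1 does not by itself reduce the apparent N1 amplitude.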
Results
Behavioral results
Consistent with Kornmeier and Bach (2004, 2005), all observers reported seeing reversals only at stimulus onset and never within the 800-ms duration of stimulus presentation. Reversal rates varied across stimulus type; Schröder's staircase led to the most reversals (39%), followed by Rubin's face/vase (33%) and Lemmo's cheetahs (29%; see Figure 3). 
Figure 3
 
Mean number of reversals (out of 450 total stimulus presentations) across observers for the three types of multistable stimuli. Error bars represent ±1 SEM.
Reaction times to reversals were consistent across stimulus type and were never longer than the 800-ms stimulus presentation: Lemmo's cheetahs: M = 503 ms, SEM = 15.12; Rubin's face/vase: M = 501 ms, SEM = 14.81; Schröder's staircase: M = 506 ms, SEM = 12.90. Thus, all responses occurred within the time frame of the stimulus presentation. 
Electrophysiological results
Figure 4 shows the grand mean ERPs from each of the eight electrode sites analyzed for each of the three multistable stimuli. Examples of the four components of interest (the occipital P1, occipital N1, frontal N1, and reversal negativity) are indicated in the upper-left panels of the figure. The occipital P1 analysis revealed a main effect of perception, F(1,20) = 5.047, p = .036 (MSE = 3.039; η² = .202; power = .571), and no interaction between perception and stimulus type, F(2,20) = 0.771, p = .469. Post hoc comparisons (Tukey's HSD test) revealed significant differences between reversal and stability in the P1 component for the face/vase, q(20) = 2.97, p < .05, but not for the staircase, q(20) = 1.47, p > .05, or the cheetahs, q(20) = 1.06, p > .05. The occipital N1 analysis revealed a main effect of perception, F(1,20) = 14.96, p = .001 (MSE = 0.679; η² = .428; power = .957), and no interaction between perception and stimulus type, F(2,20) = 1.25, p = .27. Post hoc comparisons revealed significant differences between reversal and stability in the N1 component for the face/vase, q(20) = 4.34, p < .01, and staircase, q(20) = 3.54, p < .05, but not for the cheetahs, q(20) = 1.60, p > .05. No differences between reversal and stability latencies for the P1 or N1 were found. Mean P1 (N1) peak latencies for reversals and stability were 121 ms (174 ms) for the cheetahs, 113 ms (179 ms) for the face/vase, and 115 ms (176 ms) for the stairs.
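The reported effect sizes can be cross-checked against the F ratios. The source does not state which formula was used, but the values agree with partial eta squared computed as F·df1 / (F·df1 + df2). The sketch below applies that formula to the four main effects of perception reported in this section; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, reported eta squared) for the main effects of perception, all F(1,20).
reported = {
    "occipital P1": (5.047, 0.202),
    "occipital N1": (14.96, 0.428),
    "frontal N1": (10.18, 0.337),
    "reversal negativity": (8.87, 0.307),
}

recomputed = {
    name: partial_eta_squared(f_value, 1, 20)
    for name, (f_value, _) in reported.items()
}

for name, (_, eta_reported) in reported.items():
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {eta_reported:.3f}")
```

Each recomputed value matches the reported one to three decimal places.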
Figure 4

Grand average ERPs (N = 21) from eight electrode sites (marked on the sensor layout above) for each of the three multistable stimuli. All graphs show amplitude (in microvolts) plotted as a function of time (in milliseconds poststimulus). The occipital P1, N1, and reversal negativity can be seen in all parietal–occipital (PO7, PO8) and occipital (O1, Oz, O2) electrode sites. The frontal N1 can be seen in all frontal (F1, Fz, F2) electrode sites. Note that the plots for the face/vase and stairs are equivalently scaled, whereas the cheetah plots are scaled differently due to the increased amplitude of the occipital P1.
Analysis of the frontal N1 component revealed a significant main effect of perception, F(1,20) = 10.18, p = .005 (MSE = 0.562; η² = .337; power = .859), and no interaction between perception and stimulus type, F(2,20) = 0.309, p = .736. Post hoc comparisons revealed significant differences between reversal and stability for the face/vase, q(20) = 3.44, p < .05, but not for the stairs, q(20) = 2.06, p > .05, or for the cheetahs, q(20) = 2.31, p > .05. Frontal N1 latencies showed small differences for reversal versus stability, with reversal N1s consistently peaking slightly earlier than stability N1s. The mean frontal N1 latencies for cheetah reversals and stability were 129 and 135 ms, respectively; for the face/vase and stairs stimuli, these were 118 and 121 ms and 118 and 120 ms, respectively.
The reversal negativity was identified for two of the three stimulus types: the face/vase and the staircase. Figure 5 shows the difference waves for the five occipital/parietal sites. A mean amplitude measure across the time window of this component (200–400 ms) revealed a significant main effect of perception, F(1,20) = 8.87, p = .007 (MSE = 1.069; η² = .307; power = .808), and no interaction between perception and stimulus type, F(2,20) = 1.047, p = .360. Planned comparisons between reversal and stability mean amplitudes revealed significant differences for the face/vase, t(20) = 2.82, p < .015, and the stairs, t(20) = 3.00, p < .015, but not for the cheetahs, t(20) = 1.48, p > .015. Grand mean scalp topography plots show the differences between reversal and stability for the three multistable stimuli at the time of maximal difference (Figure 6). The reversal negativity for the face/vase shows a slight lateralization to the left, whereas the staircase reversal negativity is bilaterally distributed. Interestingly, a possible counterpart to the posterior reversal negativity was also identified, that is, a frontal reversal positivity (see frontal sites in Figure 4). The fact that it occurs at the same time, but is of opposite polarity and scalp location, suggests that it may be generated by the same dipole source.
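The planned comparisons are paired tests across observers, which is why 21 observers yield 20 degrees of freedom. The sketch below shows the generic paired t statistic on invented numbers; it is not the study's data or analysis code, and it stops at t and df because the source reports a criterion of p < .015 without naming the correction procedure.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(condition_a, condition_b):
    """Paired t statistic and degrees of freedom for matched observations."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


# Invented mean amplitudes (uV) for four hypothetical observers.
reversal_uv = [1.0, 2.0, 3.0, 4.0]
stability_uv = [0.0, 0.0, 0.0, 0.0]

t_value, df = paired_t(reversal_uv, stability_uv)
print(f"t({df}) = {t_value:.2f}")
```

With one mean amplitude per observer and condition, the same function applied to 21 observers gives df = 20, as in the t(20) values above.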
Figure 5
 
Grand average (N = 21) difference waves (reversal minus stability) are shown for the three multistable stimuli at the five occipital/parietal locations. Mean amplitude difference is plotted as a function of time (in milliseconds poststimulus). Solid lines represent the mean and dashed lines represent ±SEM. The reversal negativity (RN) was identified for the face/vase and staircase stimuli and can be observed in all five electrode channels. The RN begins at approximately 200 ms, peaks between 250 and 350 ms, and lasts until 400 ms. No significant RN was found for the cheetahs, although the PO8 and O2 channels suggest the presence of a small RN.
Figure 6a, 6b, 6c, 6d
 
Topographic maps showing the reversal negativity for the face/vase and staircase stimuli. Peak latencies of the reversal negativity are shown: face/vase = 301 ms, staircase = 298 ms, cheetahs = 314 ms (a small nonsignificant negativity was found for the cheetahs). The scales are set to include the middle of the amplitude distribution (25–75%) and are, therefore, slightly different for each plot: face/vase = +2.42 to −1.68 μV; staircase = +2.98 to −2.22 μV; cheetahs = +4.92 to −3.82 μV.
Discussion
The focus of this study was to determine whether there are identifiable electrophysiological correlates associated with perceptual reversals of multistable stimuli. For each multistable stimulus, the physical input, that is, the retinal image, remained constant. Any changes in electrophysiological response could therefore be attributed to higher level perceptual or cognitive factors, rather than to factors dependent on early sensory-input properties, for example, retinal processing. Because the three types of multistable stimuli used in this study were physically different, ERP amplitude and latency differences were expected across stimulus type and were not of primary interest. The important differences for the purposes of this study were between the reversal and stability waveforms within each stimulus type. Our analyses sought to determine when these perceptual-based electrophysiological differences occurred and whether these differences were consistent across various types of multistable figures. 
Our results showed enhanced P1 (≈115 ms) amplitude for face/vase reversals and enhanced N1 (≈175 ms) amplitudes for both face/vase and staircase reversals. We also identified a subsequent broad negativity (200–400 ms) for perceptual reversals of the face/vase and staircase stimuli. Reasons why differences in the P1 and N1 components were not found for all three stimuli may be related to the small size of these early sensory components. More trials are typically necessary to identify differences in these components, as compared with the later and larger perceptual components (Luck, 2005). Future studies in our laboratory will seek to address this issue. 
No ERP differences were found for reversals of the cheetah stimulus. It is possible that participants were not able to identify reversals in the cheetah figure as distinctly as in the other two figures, leading to greater variability and diminished ERP effects. In a postexperiment questionnaire, participants often reported this to be the case, and the cheetah stimulus led to the smallest percentage of reversal trials. An inability of the visual system to successfully alter spatial and/or selective attention (discussed below) to perceive the two distinct configurations may be responsible for the lack of ERP differences with this stimulus. A possible contributing factor for the lack of ERP effects in response to the cheetah stimulus was that the original image was cropped by the researchers to maintain a centralized presentation and allow fixation on an area of the image critical to perceptual reversals. The cropped image contains fewer figure-belongingness cues than the original image, which may have led to fewer and less salient reversals. The original image is shown in Figure 7 for comparison. 
Figure 7
 
The original photograph of Lemmo's ambiguous cheetahs (Gerry Lemmo, 2006). The photograph was cropped for the experiment to allow a central presentation and a fixation cross near the necks of the cheetahs. See Figure 1C for the modified version used in the experiment.
The timing and scalp topography of the ERP differences found for the face/vase and reversible staircase stimuli suggest a close relationship between changes in multistable perception and changes in spatial and selective attention. The following discussion will focus on evaluating these relationships, as well as incorporating these ERP findings into models and theories of multistable perception. 
Early attention effects
Numerous studies have demonstrated that the occipital P1 and N1 amplitudes are dependent on spatial attention (e.g., Clark & Hillyard, 1996; Hillyard & Anllo-Vento, 1998; Hillyard et al., 1998; Luck et al., 1994; Mangun, 1995). For example, Luck et al. (1994) employed cueing paradigms to manipulate participants' attention to particular spatial locations. P1 and N1 amplitudes for attended versus unattended locations were compared with amplitudes for neutral trials, in which attention was more broadly focused. Unattended stimuli led to decreases in P1 amplitude, whereas attended stimuli led to increases in N1 amplitude. Interestingly, in the current study, the reversal versus stability waveforms closely resemble Luck et al.'s attended versus unattended waveforms, respectively. In stable trials, spatial attention can be assumed to be sustained, whereas in reversal trials, spatial attention has arguably changed. This is not necessarily equivalent to attending versus not attending but may instead reflect a difference between sustaining versus shifting spatial attention. Future studies should address the possibility that P1 and N1 amplitudes may be enhanced by redirecting versus sustaining spatial attention as opposed to attending versus not attending to a particular location. It is also worth noting that although the reversal and stability P1 and N1 amplitudes closely resemble the attended versus unattended P1 and N1 amplitudes from other studies, these differences could be caused by distinct underlying components. Further research is necessary to clarify this relationship. 
The difficulty in interpreting these ERP signatures of spatial attention under the current experimental paradigm is due to the fact that reversals in both directions were averaged together for analysis. For example, the behavioral task was such that reversals from perceiving the vase to perceiving the faces were treated the same as (and averaged together with) reversals from perceiving the faces to perceiving the vase. Averaging reversals in both directions may obscure some of the effects of spatial attention alteration. Studies that are currently underway in our laboratory are attempting to separate the two types of directional reversals by modifying the behavioral task of the observers. It is also possible that repositioning the stimuli in space between flashes had an unintended effect of externally inducing changes in spatial attention. 
Although it is difficult to localize the neural generators of ERP components, recent attempts have been made to identify the neurophysiological sources of the P1 and N1 components. Such approaches include combining ERP and fMRI techniques in a single study (Di Russo, Martinez, Sereno, Pitzalis, & Hillyard, 2001; Martinez et al., 2001), combining ERP and PET measurements (Woldorff et al., 1997), combining ERP and MEG measurements (Hopf, Vogel, Woodman, Heinze, & Luck, 2002), analyzing scalp current density mappings (Gomez Gonzalez, Clark, Fan, Luck, & Hillyard, 1994; Johannes, Munte, Heinze, & Mangun, 1995), and modeling spatiotemporal dipoles (Anllo-Vento, Luck, & Hillyard, 1998; Clark & Hillyard, 1996; Di Russo et al., 2001; Gomez Gonzalez et al., 1994). All of these approaches provide converging evidence that the generator of the occipital P1 is located in extrastriate cortex, either in dorsal or ventral regions, depending on the particular techniques used. The N1 generators, on the other hand, have been localized to the ventral pathway, in particular the occipitotemporal cortex, although one study (Martinez et al., 2001) suggests that negativities ranging from 160 to 260 ms may reflect delayed V1 activity that is influenced by reentrant feedback from higher visual areas. Enhancement of P1 and/or N1 amplitudes, therefore, may reflect an initial increase in activity in extrastriate cortex and a subsequent increase in occipitotemporal cortex activation. In the context of the current study, early processing of the multistable stimuli may be equivalent for reversal and stability trials in V1 but enhanced or suppressed in extrastriate cortex depending on spatial attention factors (as indicated by P1 differences). 
Following this initial change in extrastriate activity, cortical regions in the ventral pathway are then affected by the allocation of spatial attention and show differences in activation for reversal/stability trials (i.e., N1 enhancement for reversal trials). 
Reversal negativity
Kornmeier and Bach (2004) coined the term reversal negativity to describe the broad negative difference in the ERP waves (from 200 to 400 ms poststimulus) that occurs for reversal trials compared with stability trials. They argued that this component was distinct from the SN component. The SN has been described as a selective attention-dependent ERP component (Hillyard & Anllo-Vento, 1998). If observers are instructed to pay attention to a certain stimulus feature such as color, orientation, or shape, a broad negativity (beginning at 180–200 ms and persisting for another 200 ms) can be identified when comparing trials in which the attended feature appears to trials in which a nonattended feature appears (Anllo-Vento & Hillyard, 1996; Martin-Loeches, Hinojosa, & Rubia, 1999; Michie et al., 1999; Smid, Jakob, & Heinze, 1997, 1999; Valdes-Sosa, Bobes, Rodriguez, & Pinilla, 1998). In some of these studies, participants were instructed to respond when the to-be-attended feature appeared. Kornmeier and Bach suggested that their reversal negativity was distinct from the SN because the reversal negativity was identified both in trials in which the “target” (i.e., what the observers responded to) was perceptual reversal and in trials in which the target was perceptual stability. Anllo-Vento and Hillyard (1996), however, designed a clever experiment in which attended location–feature combination trials were separable from target trials. They then compared target to nontarget ERPs and showed that the SN is not dependent on target selection and that the earliest target-related ERP differences occur later than 325 ms poststimulus. 
Because of the striking similarity (in polarity, scalp topography, and latency) between the SN and the reversal negativity, as well as the assumed independence of the SN and target-selection components, a more parsimonious explanation would suggest a common underlying mechanism. It may be the case that the reversal negativity (or SN) reflects a change in selective attention. To disambiguate a multistable figure, it can be assumed that selective attention is required regardless of whether the figure appeared the same or different on the previous trial. The reversal negativity (or SN) identified in the current study must, therefore, reflect a change in selective attention, that is, attention to certain features of the multistable figure in one trial followed by attention to different features in the subsequent trial. This interpretation supports the notion that early spatial attention alteration (indexed by P1 and N1 amplitudes) modulates later feature selection (indexed by the reversal negativity or SN), which determines how the multistable image is perceived. It is likely that the recognition of reversals is dependent on these early attentional changes and occurs during or after the reversal negativity (or SN). 
Visual attention/environment exploration theory
Leopold and Logothetis (1999) proposed a theory of multistable perception that is largely nonsensory in origin. Based on perceptual research in a variety of contexts, they support a view in which perceptual reversals are the necessary consequences of a generalized high-level “exploratory” mechanism that directs spatial and selective attention in a way that forces lower level perceptual systems to periodically refresh. This mechanism is described as being neither purely sensory nor purely motor but, rather, as a mechanism in which the ultimate goal is to “use” and “act upon” environmental information. By continually reorganizing and refreshing perceptual processing, accurate interpretation of visual input is improved. In normal everyday situations, this central mechanism (most likely a frontoparietal network) works with eye movement centers (the frontal eye fields) to mediate a continuous exploration of the visual scene. Visual attention is most easily controlled through eye movements, and objects of interest are usually unambiguous. In multistable perception experiments, covert attention (without voluntary eye movements) may be altered by this central exploratory mechanism (i.e., P1 and N1 differences; Figure 4), and due to the ambiguity of the stimuli, reversals in perceptual interpretation consistently occur. Although these are all assumed to work largely in an unconscious, automatic fashion, it is possible that voluntary control over multistable perceptual reversals works through this same mechanism. 
Recent fMRI studies, for example, have found evidence that intentional reversals of multistable stimuli are mediated through attentional mechanisms (Slotnick & Yantis, 2005). A comparison of brain activity during Necker cube reversals versus simple left–right attentional shifts revealed similar areas of neural activation. Transient increases of activity in the superior parietal lobule and intraparietal sulcus occurred for both voluntary shifts in spatial attention and voluntary reversals of the Necker cube and influenced activity in early visual areas (Slotnick & Yantis, 2005). It is possible that when observers attempt to control perceptual reversals of multistable stimuli, they are tapping into this normally automatic exploratory “perceptual-refresh” mechanism to change spatial attention and reorganize perceptual interpretation. 
Numerous studies have shown that perceptual reversals can be controlled voluntarily (e.g., Kawabata, 1986; Leopold & Logothetis, 1999; Liebert & Burk, 1985; Long & Toppino, 2004; Pelton & Solley, 1968; Struber & Stadler, 1999; Toppino, 2003; van Ee, 2005; van Ee et al., 2005; Windmann et al., 2006), although it is always the case that involuntary reversals continue to occur as well. Slotnick and Yantis (2005) argue that these unintentional reversals are evidence that attention cannot account for all perceptual shifts in multistable perception and that perceptual fatigue may also play a role. If Leopold and Logothetis's (1999) theory is correct, however, there is no need to rely on the notion of perceptual fatigue at all. When participants in Slotnick and Yantis's experiment experienced unintentional shifts in perception, the unconscious, automatic exploratory mechanism may have been responsible. In many trials, participants were able to control the attentional shifts invoked by this mechanism, but when the task's demands conflicted with the system's preexisting strategy of continuously refreshing perception, reversals occurred largely on their own. Experiments in our laboratory have attempted to measure the ERP correlates of intentional control over perceptual reversals of multistable stimuli by comparing intentional to unintentional reversals. Results suggest that amplitudes of the occipital N1 are enhanced for intentional versus unintentional reversals, revealing possible influences of the frontoparietal attention network (Pitts, in press). 
Low-level/high-level theories
The most recent debate involving multistable perception has focused on the so-called bottom–up/top–down dichotomy (see Long & Toppino, 2004, for a complete review). This terminology may be overly simplistic at first glance; however, if bottom–up is taken to mean “feedforward processing” and top–down is taken to mean “feedback processing,” this distinction may be of theoretical as well as neurophysiological significance. The bottom–up approach emphasizes the critical role of neural adaptation (or satiation) in accounting for the alterations of multistable percepts (Cohen, 1959; Kohler, 1940). In this view, sometimes referred to as the “neural-channel model,” perceptual reversals of multistable figures are caused by disparate fatigue–recovery cycles of the neural circuits underlying each of the percepts (Long & Toppino, 2004; Orbach et al., 1963; Toppino & Long, 1987). This theory describes figure reversals as passive, automatic, sensory-driven events that are independent of cognition. The term bottom–up thus refers to the hypothesized direction of information flow, from the retinal stimulus, to early stages of sensory processing, through various intermediate stages, and finally to perceptual and cognitive awareness. In the opposing top–down view, an emphasis is placed on the active role of the observer. Cognitive processes such as memory, attention, and decision making are brought to the forefront in top–down theories (e.g., Georgiades & Harris, 1997; Horlitz & O'Leary, 1993; Kawabata, 1986; Leopold, 2003; Meng & Tong, 2004; Pelton & Solley, 1968; Rock et al., 1994; Shulman, 1993; Struber & Stadler, 1999; Toppino, 2003). In contrast to the bottom–up account, top–down theories emphasize the flow of information from high-level nonsensory systems to lower level perceptual processes. 
As a clear example of the two competing theories, consider the following: It has been shown that preexposure to unambiguous versions of multistable figures can influence subsequent perceptual interpretation of ambiguous versions (Long & Toppino, 2004). For example, if an unambiguous cube is flashed briefly, followed by an ambiguous Necker cube, observers tend to report perceiving the Necker cube in the same configuration as the unambiguous prime (Long, Toppino, & Mondin, 1992). This evidence has been used to support the top–down theory; that is, previous knowledge or memory has been activated by the unambiguous prime and then influences the perception of the ambiguous target. However, if observers are asked to stare at an unambiguous version of the cube for an extended period (e.g., >30 s) and are then presented with an ambiguous Necker cube, they most often report perceiving the opposite interpretation of the cube (Long et al., 1992). This adaptation effect has been interpreted as support for a bottom–up or fatigue-based account of figure reversal. Thus, simply changing the duration of preexposure to an unambiguous version of a multistable figure allows one to support either (or neither) of the opposing theories. 
Kornmeier and Bach (2004, 2005) maintain that early ERP differences (120 and 250 ms) in reversal versus stability waveforms support a bottom–up theory of multistable perception. Although one can argue indefinitely (and quite hopelessly) about what “early” versus “late” means as far as the timing of visual processing, we do not agree that the timing of these ERP differences can support either bottom–up or top–down theories directly. An experimental manipulation involving one or both of the possible directions of influence is required to support one of the two opposing theories (such as that in Long et al., 1992; Slotnick & Yantis, 2005; Windmann et al., 2006). Additionally, if the spatial attention theory holds true, the top–down influences mediated by spatial attention alteration are likely to occur very early in visual processing. In this theory, higher level networks are focused on exploring the visual environment and constantly redirect attention to ensure consistent, accurate perceptual interpretation. Whether these networks function completely independently of an observer's intention or are slightly modified and controlled by intentional demands, they are nevertheless influencing perception at an early stage. Previous fatigue-based accounts could be reconsidered as evidence of sustained spatial attention that, when given a chance to explore, does so immediately (i.e., in the adaptation example described above). Instead of fatiguing lower level sensory or perceptual neurons, adaptation may work to provoke spatial attention alterations following continuous attention to a certain region or perceptual configuration of the figure. Clearly, a strict bottom–up/top–down explanation of perceptual reversals is overly simplistic. If further experimentation continues to support the nonsensory (exploratory/perceptual-refresh) theory of Leopold and Logothetis (1999), a new model of multistable perception will be required. 
Conclusions
P1 and N1 amplitude enhancements for perceptual reversal compared with perceptual stability were found for Rubin's face/vase, and an N1 amplitude enhancement was found for Schröder's staircase. These ERP changes most likely reflect changes in visual spatial attention. A later broad negativity, the reversal negativity, was found for perceptual reversals of the same two multistable stimuli. This negativity closely resembles the SN of previous studies and may reflect changes in selective attention that are critical for figure reversals. These findings do not support a strict low-level or high-level theory of multistable perception but, instead, suggest a critical role for perceptual exploration mediated by visual attention mechanisms. 
Acknowledgments
This research was supported in part by an NSF Research Experience for Undergraduates Grant. 
The authors wish to thank William J. Gavin for his contributions to this project and the Journal of Vision reviewers for their helpful comments on an earlier draft. 
Commercial relationships: None. 
Corresponding author: Michael A. Pitts. 
Address: Department of Psychology, Colorado State University, Fort Collins, CO, 80523. 
References
Anllo-Vento, L. Hillyard, S. A. (1996). Selective attention to the color and direction of moving stimuli: Electrophysiological correlates of hierarchical feature selection. Perception & Psychophysics, 58, 191–206. [PubMed] [CrossRef] [PubMed]
Anllo-Vento, L. Luck, S. J. Hillyard, S. A. (1998). Spatio-temporal dynamics of attention to color: Evidence from human electrophysiology. Human Brain Mapping, 6, 216–238. [PubMed] [CrossRef] [PubMed]
Basar-Eroglu, C. Struber, D. Stadler, M. Kruse, P. Basar, E. (1993). Multistable visual perception induces a slow positive EEG wave. International Journal of Neuroscience, 73, 139–151. [PubMed] [CrossRef] [PubMed]
Clark, V. Hillyard, S. (1996). Spatial selective attention affects early extrastriate but not striate components of the visual evoked potential. Journal of Cognitive Neuroscience, 8, 387–402. [CrossRef] [PubMed]
Cohen, L. (1959). Rate of apparent change of a Necker cube as a function of prior stimulation. American Journal of Psychology, 72, 327–344. [CrossRef]
Di Russo, F. Martinez, A. Sereno, M. I. Pitzalis, S. Hillyard, S. A. (2001). Cortical sources of the early components of the visual evoked potential. Human Brain Mapping, 15, 95–111. [PubMed] [CrossRef]
Foxe, J. J. Simpson, G. V. (2002). Flow of activation from V1 to frontal cortex in humans: A framework for defining “early” visual processing. Experimental Brain Research, 142, 139–150. [PubMed] [CrossRef] [PubMed]
Georgiades, M. Harris, J. (1997). Biasing effects in ambiguous figures: Removal or fixation of critical features can affect perception. Visual Cognition, 4, 383–408. [CrossRef]
Gomez Gonzalez, C. M. Clark, V. P. Fan, S. Luck, S. J. Hillyard, S. A. (1994). Sources of attention-sensitive visual event-related potentials. Brain Topography, 7, 41–51. [PubMed] [CrossRef] [PubMed]
Hillyard, S. A. Anllo-Vento, L. (1998). Event-related brain potentials in the study of visual selective attention. Proceedings of the National Academy of Sciences of the United States of America, 95, 781–787. [PubMed] [Article] [CrossRef] [PubMed]
Hillyard, S. A. Vogel, E. K. Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philosophical Transactions of the Royal Society B: Biological Sciences, 353, 1257–1270. [PubMed] [Article] [CrossRef]
Hopf, J. M. Vogel, E. Woodman, G. Heinze, H. J. Luck, S. J. (2002). Localizing visual discrimination processes in time and space. Journal of Neurophysiology, 88, 2088–2095. [PubMed] [Article] [PubMed]
Horlitz, K. L. O'Leary, A. (1993). Satiation or availability? Effects of attention, memory, and imagery on the perception of ambiguous figures. Perception & Psychophysics, 53, 668–681. [PubMed] [CrossRef] [PubMed]
Inui, T. Tanaka, S. Okada, T. Nishizawa, S. Katayama, M. Konishi, J. (2000). Neural substrates for depth perception of the Necker cube; a functional magnetic resonance imaging study in human subjects. Neuroscience Letters, 282, 145–148. [PubMed] [CrossRef] [PubMed]
Isoglu-Alkac, U. Basar-Eroglu, C. Ademoglu, A. Demiralp, T. Miener, M. Stadler, M. (1998). Analysis of the electroencephalographic activity during the Necker cube reversals by means of the wavelet transform. Biological Cybernetics, 79, 437–442. [PubMed] [CrossRef] [PubMed]
Johannes, S. Munte, T. F. Heinze, H. J. Mangun, G. R. (1995). Luminance and spatial attention effects on early visual processing. Cognitive Brain Research, 2, 189–205. [PubMed] [CrossRef] [PubMed]
Kawabata, N. (1986). Attention and depth perception. Perception, 15, 563–572. [PubMed] [CrossRef] [PubMed]
Kleinschmidt, A. Buchel, C. Zeki, S. Frackowiak, R. S. (1998). Human brain activity during spontaneously reversing perception of ambiguous figures. Proceedings of the Royal Society B: Biological Sciences, 265, 2427–2433. [PubMed] [Article] [CrossRef]
Kohler, W. (1940). Dynamics in psychology. New York: Liveright.
Kornmeier, J. Bach, M. (2004). Early neural activity in Necker-cube reversal: Evidence for low-level processing of a gestalt phenomenon. Psychophysiology, 41, 1–8. [PubMed] [CrossRef] [PubMed]
Kornmeier, J. Bach, M. (2005). The Necker cube—an ambiguous figure disambiguated in early visual processing. Vision Research, 45, 955–960. [PubMed] [CrossRef] [PubMed]
Lemmo, G. (2006). National Geographic World.
Leopold, D. A. (2003). Visual perception: Shaping what we see. Current Biology, 13, R10–R12. [PubMed] [Article] [CrossRef] [PubMed]
Leopold, D. A. Logothetis, N. K. (1996). Activity changes in early visual cortex reflect monkeys' percepts during binocular rivalry. Nature, 379, 549–553. [PubMed] [CrossRef] [PubMed]
Leopold, D. A. Logothetis, N. K. (1999). Multistable phenomena: Changing views in perception. Trends in Cognitive Sciences, 3, 254–264. [PubMed] [CrossRef] [PubMed]
Leopold, D. A. Wilke, M. Maier, A. Logothetis, N. K. (2002). Stable perceptions of visually ambiguous patterns. Nature Neuroscience, 5, 605–609. [PubMed] [CrossRef] [PubMed]
Liebert, R. M. Burk, B. (1985). Voluntary control of reversible figures. Perceptual and Motor Skills, 61, 1307–1310. [PubMed] [CrossRef] [PubMed]
Logothetis, N. K. Schall, J. D. (1989). Neuronal correlates of subjective visual perception. Science, 245, 761–763. [PubMed] [CrossRef] [PubMed]
Long, G. M. Toppino, T. C. (2004). Enduring interest in perceptual ambiguity: Alternating views of reversible figures. Psychological Bulletin, 130, 748–768. [PubMed] [CrossRef] [PubMed]
Long, G. M. Toppino, T. C. Mondin, G. W. (1992). Prime time: Fatigue and set effects in the perception of reversible figures. Perception & Psychophysics, 52, 609–616. [PubMed] [CrossRef] [PubMed]
Luck, S. (2005). An introduction to the event-related potential technique. Cambridge, MA: MIT Press.
Luck, S. J. Hillyard, S. A. Mouloua, M. Woldorff, M. G. Clark, V. P. Hawkins, H. L. (1994). Effects of spatial cuing on luminance detectability: Psychophysical and electrophysiological evidence for early selection. Journal of Experimental Psychology: Human Perception and Performance, 20, 887–904. [PubMed] [CrossRef] [PubMed]
Luu, P. Ferree, T. (2005). Electrical Geodesics, Inc., Technical Note.
Mangun, G. R. (1995). Neural mechanisms of visual selective attention. Psychophysiology, 32, 4–18. [PubMed] [CrossRef] [PubMed]
Martinez, A. DiRusso, F. Anllo-Vento, L. Sereno, M. I. Buxton, R. B. Hillyard, S. A. (2001). Putting spatial attention on the map: Timing and localization of stimulus selection processes in striate and extrastriate visual areas. Vision Research, 41, 1437–1457. [PubMed] [CrossRef] [PubMed]
Martin-Loeches, M. Hinojosa, J. A. Rubia, F. J. (1999). Insights from event-related potentials into the temporal and hierarchical organization of the ventral and dorsal streams of the visual system in selective attention. Psychophysiology, 36, 721–736. [PubMed] [CrossRef] [PubMed]
Meenan, J. P. Miller, L. A. (1994). Perceptual flexibility after frontal or temporal lobectomy. Neuropsychologia, 32, 1145–1149. [PubMed] [CrossRef] [PubMed]
Meng, M. Tong, F. (2004). Can attention selectively bias bistable perception? Differences between binocular rivalry and ambiguous figures. Journal of Vision, 4(7), 539–551, http://journalofvision.org/4/7/2/, doi:10.1167/4.7.2. [PubMed] [Article] [CrossRef] [PubMed]
Michie, P. T. Karayanidis, F. Smith, G. L. Barrett, N. A. Large, M. M. O'Sullivan, B. T. (1999). An exploration of varieties of visual attention: ERP findings. Cognitive Brain Research, 7, 419–450. [PubMed] [CrossRef] [PubMed]
O'Donnell, B. F. Hendler, T. Squires, N. K. (1988). Visual evoked potentials to illusory reversals of the Necker cube. Psychophysiology, 25, 137–143. [PubMed] [CrossRef] [PubMed]
Orbach, J. Ehrlich, D. Heath, H. A. (1963). Reversibility of the Necker cube: I. An examination of the concept of “satiation of orientation.” Perceptual and Motor Skills, 17, 439–458. [PubMed] [CrossRef] [PubMed]
Pelton, L. H. Solley, C. M. (1968). Acceleration of reversals of a Necker cube. American Journal of Psychology, 81, 585–588. [PubMed] [CrossRef] [PubMed]
Picton, T. (1992). The P300 wave of the human event-related potential. Journal of Clinical Neurophysiology, 9, 456–479. [PubMed] [CrossRef] [PubMed]
Pitts, M. (in press). Top–down influences on bistable perception revealed by event-related potentials.
Ricci, C. Blundo, C. (1990). Perception of ambiguous figures after focal brain lesions. Neuropsychologia, 28, 1163–1173. [PubMed] [CrossRef] [PubMed]
Rock, I. Hall, S. Davis, J. (1994). Why do ambiguous figures reverse? Acta Psychologica, 87, 33–59. [PubMed] [CrossRef] [PubMed]
Sheinberg, D. L. Logothetis, N. K. (1997). The role of temporal cortical areas in perceptual organization. Proceedings of the National Academy of Sciences of the United States of America, 94, 3408–3413. [PubMed] [Article] [CrossRef] [PubMed]
Shulman, G. (1993). Attentional effects on Necker cube adaptation. Canadian Journal of Experimental Psychology, 47, 540–547. [CrossRef]
Slotnick, S. D. Yantis, S. (2005). Common neural substrates for the control and effects of visual attention and perceptual bistability. Cognitive Brain Research, 24, 97–108. [PubMed] [CrossRef] [PubMed]
Smid, H. G. Jakob, A. Heinze, H. J. (1997). The organization of multidimensional selection on the basis of color and shape: An event-related brain potential study. Perception & Psychophysics, 59, 693–713. [PubMed] [CrossRef] [PubMed]
Smid, H. G. Jakob, A. Heinze, H. J. (1999). An event-related brain potential study of visual selective attention to conjunctions of color and shape. Psychophysiology, 36, 264–279. [PubMed] [CrossRef] [PubMed]
Struber, D. Stadler, M. (1999). Differences in top–down influences on the reversal rate of different categories of reversible figures. Perception, 28, 1185–1196. [PubMed] [CrossRef] [PubMed]
Toppino, T. C. (2003). Reversible-figure perception: Mechanisms of intentional control. Perception & Psychophysics, 65, 1285–1295. [PubMed] [CrossRef] [PubMed]
Toppino, T. C. Long, G. M. (1987). Selective adaptation with reversible figures: Don't change that channel. Perception & Psychophysics, 42, 37–48. [PubMed] [CrossRef] [PubMed]
Valdes-Sosa, M. Bobes, M. A. Rodriguez, V. Pinilla, T. (1998). Switching attention without shifting the spotlight object-based attentional modulation of brain potentials. Journal of Cognitive Neuroscience, 10, 137–151. [PubMed] [CrossRef] [PubMed]
van Ee, R. (2005). Dynamics of perceptual bi-stability for stereoscopic slant rivalry and a comparison with grating, house-face, and Necker cube rivalry. Vision Research, 45, 29–40. [PubMed] [CrossRef] [PubMed]
van Ee, R. van Dam, L. C. Brouwer, G. J. (2005). Voluntary control and the dynamics of perceptual bi-stability. Vision Research, 45, 41–55. [PubMed] [CrossRef] [PubMed]
Windmann, S. Wehrmann, M. Calabrese, P. Gunturkun, O. (2006). Role of the prefrontal cortex in attentional control over bistable vision. Journal of Cognitive Neuroscience, 18, 456–471. [PubMed] [CrossRef] [PubMed]
Woldorff, M. Fox, P. Matzke, M. Lancaster, J. Veeraswamy, S. Zamarripa, F. (1997). Retinotopic organization of early visual spatial attention effects as revealed by PET and ERPs. Human Brain Mapping, 5, 280–286. [CrossRef] [PubMed]
Figure 1a, 1b, 1c
 
The three multistable stimuli used in this study. (A) A modified Rubin's face/vase (edge version) elicits figure-ground reversals with minimal depth cues; that is, the figure can be seen as a centrally placed vase or as two profiles facing one another. (B) A modified Schröder's staircase (45° tilted) elicits depth-perspective reversals; that is, the stairs can be seen in two distinct three-dimensional configurations. (C) Lemmo's ambiguous cheetahs elicit object-belongingness reversals; that is, the cheetah in the front can appear to be looking to the right while the cheetah in the back is looking to the left or, alternatively, the cheetah in the front can appear to be looking to the left while the cheetah in the back is looking to the right.
Figure 2
 
To allow time locking of the EEG recording to the stimulus rather than to the response and to entrain the moment of perceptual reversals to an externally observable event, that is, stimulus onset, we continuously flashed stimuli on and off in 800-ms stimulus/400-ms ISI presentations (adopted from Kornmeier & Bach, 2004). Observers indicated reversals by a button press, which extended the ISI to 1,000 ms before the next 800–400 ms cycle resumed.
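The presentation timing described in this caption can be sketched in a few lines of Python. This is only an illustration, not the authors' presentation software: the function name `stimulus_onsets` and the constant names are hypothetical, and the sketch assumes that a button press replaces the 400-ms ISI with a 1,000-ms ISI (rather than adding to it) for the presentation on which the reversal was reported.

```python
# Timing values taken from the caption (all in milliseconds).
STIM_MS = 800           # stimulus duration
ISI_MS = 400            # default interstimulus interval
EXTENDED_ISI_MS = 1000  # ISI after a reported reversal (assumed to replace the default ISI)


def stimulus_onsets(reversal_reported):
    """Return the onset time (ms) of each stimulus presentation.

    reversal_reported: one boolean per presentation, True if the observer
    pressed the button to report a reversal for that presentation.
    """
    onsets = []
    t = 0
    for reported in reversal_reported:
        onsets.append(t)
        isi = EXTENDED_ISI_MS if reported else ISI_MS
        t += STIM_MS + isi
    return onsets


# Four presentations; a reversal is reported on the second one.
print(stimulus_onsets([False, True, False, False]))  # [0, 1200, 3000, 4200]
```

Because the ERPs are time locked to stimulus onset, each onset returned here would mark the zero point of one epoch.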
Figure 3
 
Mean number of reversals (out of 450 total stimulus presentations) across observers for the three types of multistable stimuli. Error bars represent ±1 SEM.
Figure 4
 
Grand average ERPs (N = 21) from eight electrode sites (marked on the sensor layout above) for each of the three multistable stimuli. All graphs show amplitude (in microvolts) plotted as a function of time (in milliseconds poststimulus). The occipital P1, N1, and reversal negativity can be seen in all parietal–occipital (PO7, PO8) and occipital (O1, Oz, O2) electrode sites. The frontal N1 can be seen in all frontal (F1, Fz, F2) electrode sites. Note that the plots for the face/vase and stairs are equivalently scaled, whereas the cheetah plots are scaled differently due to the increased amplitude of the occipital P1.
Figure 5
 
Grand average ( N = 21) difference waves (reversal minus stability) are shown for the three multistable stimuli at the five occipital/parietal locations. Mean amplitude difference is plotted as a function of time (in milliseconds poststimulus). Solid lines represent the mean and dashed lines represent ± SEM. The reversal negativity (RN) was identified for the face/vase and staircase stimuli and can be observed in all five electrode channels. The RN begins at approximately 200 ms, peaks between 250 and 350 ms, and lasts until 400 ms. No significant RN was found for the cheetahs, although the PO8 and O2 channels suggest the presence of a small RN.
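A difference wave of the kind plotted here (reversal minus stability, with mean and ± SEM) could be computed roughly as follows. This is a minimal sketch under stated assumptions, not the authors' analysis code: the function name `difference_wave` is hypothetical, the inputs are assumed to be one averaged ERP per observer and condition at a single electrode, and the SEM is assumed to be taken across observers on the per-observer differences.

```python
from math import sqrt
from statistics import mean, stdev


def difference_wave(reversal, stability):
    """Grand-average difference wave and its SEM across observers.

    reversal, stability: lists with one entry per observer; each entry is a
    list of amplitudes (microvolts), one per time sample, for that condition.
    Returns (grand_mean, sem), each a list with one value per time sample.
    """
    # Per-observer difference: reversal minus stability at every sample.
    diffs = [
        [r - s for r, s in zip(rev, stab)]
        for rev, stab in zip(reversal, stability)
    ]
    n = len(diffs)
    by_sample = list(zip(*diffs))  # regroup: one tuple of observers per sample
    grand_mean = [mean(col) for col in by_sample]
    # Assumption: SEM = sample standard deviation / sqrt(number of observers).
    sem = [stdev(col) / sqrt(n) for col in by_sample]
    return grand_mean, sem


# Toy data: three observers, two time samples (values are made up).
rev = [[1.0, 0.0], [2.0, -2.0], [3.0, -1.0]]
stab = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
grand_mean, sem = difference_wave(rev, stab)
print(grand_mean)  # [1.0, -2.0]
```

A negative value in the grand mean at a given sample corresponds to a more negative ERP on reversal trials than on stability trials, which is how the reversal negativity appears in this kind of plot.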
Figure 6a, 6b, 6c, 6d
 
Topographic maps showing the reversal negativity for the face/vase and staircase stimuli. Peak latencies of the reversal negativity are shown: face/vase = 301 ms, staircase = 298 ms, cheetahs = 314 ms (a small nonsignificant negativity was found for the cheetahs). The scales are set to include the middle of the amplitude distribution (25–75%) and are, therefore, slightly different for each plot: face/vase = +2.42 to −1.68 μV; staircase = +2.98 to −2.22 μV; cheetahs = +4.92 to −3.82 μV.
Figure 7
 
The original photograph of Lemmo's ambiguous cheetahs (Gerry Lemmo, 2006). The photograph was cropped for the experiment to allow a central presentation and a fixation cross near the necks of the cheetahs. See Figure 1C for the modified version used in the experiment.