The normative approach for scientific experiments is to start from a strong theory in which a number of independent variables (e.g., stimulus size), predicted to influence a dependent variable (e.g., some measure of the percept), are specified in a conceptually coherent framework (Figure 1, hourglass). Subsequently, a critical design is chosen to test a hypothesis and advance the predefined theory by generating data that allow us to confirm or reject the hypothesis. This critical test typically focuses on a narrow range of manipulations of the independent variables. Accordingly, the hypothesis is tested as rigorously as possible, and conclusions can be generalized as broadly as the theory allows (Debrouwere & Rosseel, 2021; Meehl, 1990). This approach is optimal when a strong theory is available, meaning that the process studied is already well known and/or precise predictions can be made. However, there is considerable debate about the robustness of established knowledge of several cognitive functions (Davis-Stober & Regenwetter, 2019; Debrouwere & Rosseel, 2021; Fisher, Medaglia, & Jeronimus, 2018; Greenwald, Pratkanis, Leippe, & Baumgardner, 1986; Open Science Collaboration, 2015). Processes studied in behavioral science are influenced by multiple interacting factors, making it extremely difficult to generate robust theories and make precise predictions (Debrouwere & Rosseel, 2021; Open Science Collaboration, 2015).
Although vision science is known for relatively strong established knowledge and theory, many of the broader processes studied in vision science pose similar challenges for generating robust theories and making precise predictions. Visual perception is the result of a highly complex process depending on both stimulus and observer characteristics and, importantly, their interactions. The literature offers ample examples of studies reporting on the influence of stimulus or observer properties on perception. In addition to a wide variety of studies in which different stimuli are linked to different percepts in the average observer, research has shown that observers also vary considerably in how they perceive the same visual stimulus. A good illustration of individual variability is provided by perceptual multistability, where the same visual stimulus elicits different percepts (usually two or three) when viewed multiple times or for an extended period of time (e.g., a few minutes). Perception of the same stimulus thus varies within an individual (intra-individual variability). Inter-individual differences in these perceptual dynamics have also been well documented, and observers show reliable signatures in their “switching behavior” (e.g., the parameters of an observer's dominance duration distribution) (Brascamp, Becker, & Hambrick, 2018; Dieter, Sy, & Blake, 2017; Gallagher & Arnold, 2014).
Even well-known phenomena have been studied almost exclusively with an emphasis on either stimulus-related or observer-related effects in isolation. Indeed, keeping either the observer or the stimulus constant (or treating it as such) while manipulating the other can be a great tool to study perception. However, studying their interaction is crucial to further advance our understanding of visual perception. We may find that some interaction effects due to variability in stimuli and observers are not negligible. Yet, when research focuses on either stimulus or observer characteristics, the influence of the other and the interplay between the two are often neglected or understated, even where both are (or should be) independent variables (Mollon, Bosten, Peterzell, & Webster, 2017). One may assume that the variability of the dependent measure (e.g., variability of percepts in different observers or variability in a stimulus set) is normally distributed, leading to the notion that the average describes the individual (stimulus or observer), although this is potentially not the case (Charest & Kriegeskorte, 2015; Davis-Stober & Regenwetter, 2019; Fisher et al., 2018; Wijnants, Cox, Hasselman, Bosman, & Van Orden, 2013). For example, stimulus-related effects of the average observer may not be informative for understanding any one observer in particular (Fisher et al., 2018). Alternatively, a sample may be taken from a very homogeneous group (e.g., all similar observers at one point in time or only one particular stimulus type), causing the variability in the sample to be normally distributed but compromising generalizability and making it potentially unrepresentative of the complexity of the processes that are studied. For example, findings about the perception of well-known stimuli may not be replicated when the stimulus is only slightly different (Wijnants et al., 2013). Similarly, as visual perception is a dynamic process, seemingly robust stimulus effects for one observer may change over time as well (Friston & Kiebel, 2009; Hamaker, 2012; Koenderink, 2019; Molenaar, 2004).
Perhaps some of our theories are not yet capturing the essence of the broader studied process, and perhaps we prevent further research progress when we keep measuring our dependent variables at particular isolated points in a non-uniform space that is influenced by multiple independent variables in ways that we are currently unable to predict. Think of trying to study a complex mechanism M in a group of individuals. We know that M has fluctuations due to internal factors and is influenced by external factors. Imagine that we measure the influence of a specific range of one external factor (e.g., 10–20 units) on M at a given point in time. At another point in time, we measure fluctuations in M due to internal factors. From these isolated studies, how should we predict M in the future? Some (curvi-)linear relationship between the external factor and M may exist within the scope of the studied units of the external factor (i.e., 10–20 units), but the effect of the external factor and its interactive effects may be very different outside this specific studied range. Even if we know what to expect on average within the studied scope of the external factor, internal factors or other external factors may still override the effect of the external factor at another point in time. Similarly, we may have an idea of the fluctuations of M due to internal factors, but surely this does not allow us to make precise predictions of M when the effect of an external factor is very strong. Moreover, internal factors and their interactive effects with external factors may depend on which different individuals we study and may differ considerably from one individual to another. Therefore, even though we considered multiple influencing factors in isolation, it might still be close to impossible to replicate findings or predict M precisely in the future. 
Alternatively, we could attempt to study the influence of a wider range of external factors, while simultaneously considering internal factors in multiple individuals. In this way, we can obtain a broader view of the complex mechanisms determining M. Importantly, we do not encourage drawing strong conclusions at the end of such an exploratory approach. Because these findings are not the result of a critical test of a specific hypothesis, they should themselves be put to the test in subsequent research.
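The thought experiment about M can be made concrete with a small simulation. The sketch below is purely illustrative and rests on assumptions that are not part of the argument itself: the saturating (logistic) form of M, the per-individual `sensitivity` factor, the Gaussian internal fluctuation, and all numerical values are hypothetical. It fits a straight line to M measured within the narrow 10–20 unit range of the external factor and then compares the extrapolated prediction at 40 units with simulated observations at that value.

```python
import math
import random


def true_m(x, sensitivity, internal):
    """Hypothetical mechanism M (an assumption made for illustration only):
    a saturating effect of the external factor x, scaled by an individual's
    sensitivity, plus an internal fluctuation."""
    return sensitivity * 10.0 / (1.0 + math.exp(-(x - 15.0) / 3.0)) + internal


def simulate(xs, n_individuals, rng):
    """Return (x, M) pairs for n_individuals, each measured once at every x."""
    data = []
    for _ in range(n_individuals):
        # Inter-individual differences: each individual has its own sensitivity.
        sensitivity = rng.uniform(0.5, 1.5)
        for x in xs:
            # Intra-individual variability: a new internal fluctuation per measurement.
            internal = rng.gauss(0.0, 0.5)
            data.append((x, true_m(x, sensitivity, internal)))
    return data


def fit_line(data):
    """Ordinary least-squares fit of M = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Narrow design: the external factor is only sampled between 10 and 20 units.
narrow = simulate(range(10, 21), 20, rng)
slope, intercept = fit_line(narrow)

# Extrapolate the narrow-range fit to 40 units and compare it with simulated
# observations of different individuals at that value.
predicted_at_40 = slope * 40 + intercept
outside = simulate([40], 20, rng)
observed_at_40 = sum(y for _, y in outside) / len(outside)

print(f"slope within 10-20 units: {slope:.2f}")
print(f"linear prediction at 40 units: {predicted_at_40:.2f}")
print(f"mean simulated M at 40 units: {observed_at_40:.2f}")
```

Under these assumed settings, the relationship looks approximately linear within the studied range, yet the linear extrapolation overshoots the simulated values at 40 units because the hypothetical M saturates. Sampling the external factor more broadly, across several individuals, would reveal the saturation, which is the point of the broader design described above.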
Likewise, when we study the effects of different stimulus and observer properties in isolation, and we only consider a narrow range of the stimulus and observer space, we may be missing valuable information. This may lead to theories lacking in explanatory or predictive value for particular observers and a wider range of stimuli. When our theories and knowledge of visual phenomena cannot account for this variability, difficulties in replication arise and the robustness of theories is compromised. In preparation for the current study, we considered multiple research questions concerning the perception of a multistable stimulus dependent on stimulus and observer properties, with corresponding potential operationalizations and designs according to the “hourglass” approach. However, we were confronted with the limitations of this approach described above and concluded that it would be difficult to provide a critical test associated with a strong theory. Therefore, we found it worthwhile to consider a more exploratory approach, which may provide us with information on stimulus–observer effects that were not studied in combination before. This information may, in turn, shape our theories to better account for even main effects of simple stimuli. It implies that we will attempt to study a complex system as holistically as we can (instead of studying the effects of its components in isolation) in order to understand the complex system as a whole, as well as the effects of its components. Instead of adopting the standard hourglass approach (in which we start from a strong, broad theory, test precisely, and arrive at strong, generalizable conclusions), we will start from a limited theory, test broadly, and arrive at exploratory findings, which should be followed up by further research for replication, extension, and refinement (Figure 1, reverse hourglass).