From a theoretical point of view, the use of the shading cue involves estimates of the light field and thus observers need to judge the light field and the shape simultaneously. The conventional stimulus in perceptual experiments, a circular disk filled with a monotonic gradient on a uniform surround, represents a local shading or tonal gradient. In typical scenes, such gradients vary smoothly from point to point over large areas, whereas light fields are globally defined and tend to be invariant over large parts of the scene. Hence, it is hardly surprising that multi-local shape estimates tend to synchronize although previous reports of such synchronies involved uniform, homogeneous light fields. Here, we consider more complicated and more realistic light fields. We present extensive, highly structured, quantitative observations using novel paradigms. Human observers are able to deal with some structured light fields but totally fail in others, even though these may be formally similar (like radial and circular fields). Observers respond very differently in some cases where the light fields differ only by sign, like converging and diverging fields. These results can be qualitatively understood on the basis of a few simple assumptions, mainly global top-down template matching of peripheral local data.

*modulo* a group of ambiguities (Belhumeur, Kriegman, & Yuille, 1999). These ambiguities include absolute distance, overall spatial attitude (“additive plane”), and depth of relief (“bas-relief ambiguity”; Belhumeur et al., 1999). Such ambiguities also occur in human perception (Brewster, 1832; Hill & Bruce, 1994; Kleffner & Ramachandran, 1992; Ramachandran, 1988a, 1988b; Rittenhouse, 1786).

*chiaroscuro* (or *clair-obscur*), that is, the art of shading. They would also receive some verbal instruction and lectures from some master to get them on their way. Eventually, they would graduate to the life drawing classes. Nude models are less convenient study objects than casts because they move (at least somewhat) and have non-Lambertian surfaces. (Most real surfaces are non-Lambertian; Dana, van Ginneken, Nayar, & Koenderink, 1999.) The posing stage would be illuminated in various standard, well-designed manners, a bit like modern Hollywood studios. The “shading cue” was indispensable as a way to add “relief” to otherwise flattish (cartoon) drawings. Literature on the “reception of the light” starts with Alberti's “De Pictura” (http://www.noteaccess.com/Texts/Alberti) and Leonardo's notebooks (http://www.gutenberg.org/etext/5000). The truly scientific literature dates from the early twentieth century and is mainly in the phenomenological, Gestalt tradition (Luckiesh, 1916; Metzger, 1975; Turhan, 1935). Modern work (Palmer, 1999; Ramachandran, 1988a, 1988b) has adopted this. Perhaps unfortunately, attempts to use shape-from-shading algorithms as models for human visual abilities have proved unsuccessful.

^{1} The one feature that detracts from its success is the sharp circular outline that has nothing to do with shading but represents a cue in its own right (Cate & Behrmann, 2010; Hayakawa, Nishida, Wada, & Kawato, 1994; Humphrey, Symons, Herbert, & Goodale, 1996; Norman & Raines, 2002). Depending on whether it is seen as an occluding contour, a dihedral edge, or a surface boundary, one is aware of a sphere (necessarily convex), a local depression (“cup”) or protrusion (“cap”) in a plane, or a hemispherical cup with free surface boundary (Figure 1). One way to remove this confounding cue is to blur the edge, but most observers then fail to see a well-defined surface (Erens, Kappers, & Koenderink, 1993).

^{2} mainly because it did not conveniently fit our experimental paradigm. The most common light fields in nature are the uniform and divergent cases, with the convergent case as a rare third (Mury, Pont, & Koenderink, 2007). Other fields (like the deformation) are extremely rare. Sunlight exemplifies the uniform case, a local source exemplifies the divergent case (this could be due to a local light spot, common in forests), and a convex object seen against a broad, diffuse source (portrait taken against translucent curtains) exemplifies the convergent case. The other cases are not ruled out by physics but require skillful laboratory setups (Mury, Pont, & Koenderink, 2009a). We included the cyclical cases as typical examples of light fields with negligible ecological importance.

- In the initial condition, only the large, uniformly gray pedestal with fixation cross appears. The observer is free to choose the time to trigger the next event, after having established strict fixation of the center mark. After triggering the next event, the image does not change for a fixed (short) period (250 ms).
- The actual configuration appears and remains on for a short period (2 s). The observer maintains fixation and notices which of three mutually exclusive cases enters visual awareness: All disks appear as caps (response “convex”), all disks appear as cups (response “concave”), or a mixture of caps and cups appears (response “different”).
- After the set period of 2 s, the six disks disappear and the observer is left with the pedestal with fixation cross. In the interface, a group of three radio buttons marked “convex,” “concave,” and “different” appears. The observer selects the appropriate one and next hits the “done” button, which concludes the trial. The duration of this period is up to the observer.
- The system has returned to the initial condition, and the observer may take a rest or trigger the next trial.

- In the initial condition, only the large, uniformly gray pedestal with fixation cross appears. The observer fixates the mark and triggers the next event.
- The actual configuration appears and remains on for a short period (1 s). The observer keeps on fixating. This period is used to allow the generation of a visual awareness of the pattern.
- The configuration stays on, but a red mark appears in one of the disks so as to mark it as the *first target*. The observer is supposed to remember the shape (cap or cup) of the marked disk. This period is only short (500 ms).
- The configuration stays on, but the first red mark disappears and a second blue mark appears in one of the disks so as to mark it as the *second target*. The observer is supposed to remember the shape (cap or cup) of this disk. It may occasionally happen that the first and second targets are the same. This period is again a short one (500 ms).
- After the set period, the six disks (as well as the second mark) disappear and the observer is left with the pedestal with fixation cross. In the interface, two groups of two radio buttons marked “convex” and “concave” appear. The groups are marked “first” and “second.” The observer selects the appropriate button in each group and next hits the “done” button, which concludes the trial. The duration of this period is up to the observer.
- The system has returned to the initial condition, and the observer may take a rest or trigger the next trial.

*p* < 0.05 (unless reported otherwise).

*N* = 14).

*p* < 0.0001).

*U* that is zero for the case of no preference.^{3} Positive values imply a preference for light from above, and negative ones imply a preference for light from below. There were only two observers in the latter category. The median index value was 1.435, a very significant preference for light from above, and the inter-quartile range was 0.707–2.80, with the extremes −0.771 and 3.138. Thus, there is a rather broad spectrum of preferences among our observers.

*U* index above the median value, the median direction indicated a slight preference for directions toward the left: 16° for the convex and 6° for the concave responses.

^{4} over the circle of targets (i.e., mean values across the same or different pairs). This autocorrelation function over the pooled data is shown in Figure 11 (all observers, *N* = 8).

*p* < 0.0001; median value: 0.90, inter-quartile range: 0.86–0.94). Thus, the response to the first target has a high predictive value for the response to the second target independent of the mutual distance. This clearly reflects the informal observations.

*p* < 0.0001); for the clockwise condition, this is true for the antipode and its two neighbors (all *p* < 0.0001). This again reflects the informal observation (Figure 6, right) that the disk that is opposite to the fiducial one tends to be of opposite type (cap or cup as the case may be).

*p* < 0.005), except at the fiducial position where none of the values differ.

^{5} The half-life of a certain awareness (cap or cup) then would amount to at least 82 s.^{6} This seems reasonable given our observations. Informal observations appear to suggest a much shorter half-life time, but this is no doubt due to eye movements. With strict fixation, the spontaneous reversal rate is much slower. Note, however, that the estimate of the rate of spontaneous reversals is rather uncertain. The lowest value of the self-correlation is 0.8333, implying a rate of 0.0833 s$^{-1}$ (a half-life time of 8.3 s), which is an order of magnitude larger than the earlier estimate of 0.0085 s$^{-1}$.

$Q$. Then, the probability to see all six disks as the same (either cup or cap) is $P = Q^{6} + (1 - Q)^{6}$.^{7} Empirically, the value of $Q$ (from the mean over all observers) is approximately 3/4, implying $P = 0.1782$. Empirically, we find $P = 0.136$, which would correspond to $Q = 0.717$. Given the spread in individual observer data, this is close enough to conclude that the six disks are essentially independent in visual awareness. This is, of course, fully corroborated by the autocorrelation function, which is not significantly different from zero for finite separations.
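The arithmetic behind this independence check can be retraced with a minimal Python sketch. The function names `p_all_same` and `q_from_p` are illustrative assumptions, and the bisection is merely one convenient way to invert the relation on the branch $Q \ge 0.5$; it is not claimed to be the procedure used for the reported values.

```python
def p_all_same(q: float, n: int = 6) -> float:
    """Probability that n independent disks are all caps or all cups."""
    return q**n + (1.0 - q)**n


def q_from_p(p: float, n: int = 6, tol: float = 1e-9) -> float:
    """Invert p_all_same on [0.5, 1], where it increases monotonically."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_all_same(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(p_all_same(0.75), 4))  # 0.1782, the value implied by Q = 3/4
print(q_from_p(0.136))             # close to 0.717, the Q implied by P = 0.136
```

Because the relation is symmetric in $Q$ and $1 - Q$, the inversion returns only the branch with a preference for caps, which is the empirically relevant one here.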

$^{-1}$. The probability of 0.9291 would imply a spontaneous reversal rate of 0.0236 s$^{-1}$.^{8} Given the uncertainty in these estimates (see above), these values appear reasonable.

^{9} Apparently, there is a strong tendency to “synchronize” in the case of the uniform light field.

*blackness* from the inside, but this is not something we can test on the basis of the data.

*field lines* of the light field then replace the *rays* of the collimated beam. The difference is important. In empty space, the rays are straight lines, whereas the field lines are generally curved and may even be closed (Mury et al., 2007, 2009a).

*both* the source and the object. Both direction and magnitude of the light field vary over the surface of the sphere. All of the surface is illuminated, except for the single point where the surface does not face the source. This is in contradistinction to a sphere illuminated by a parallel, collimated beam. In that case, half of the object is in shadow (think of the moon illuminated by the sun). This is exactly the reason why most portrait photographers prefer extended sources and avoid collimated beams. Computer graphics “solves” this problem with collimated beams by adding an “ambient component.” This formally works, though only for convex objects, and at the cost of a violation of physics. It is a source of frequent misunderstandings (Koenderink & van Doorn, 1996).

- uniform light flow is the dominating pattern, both in space and on surfaces (the corncob scene in Figure 3 (upper right) is a case in point);
- the probability of a single object being at a space singularity of the flow is very low, but an extended configuration of objects is not unlikely to straddle such a location (the candle scene in Figure 3 (lower right) is a case in point);
- convergence and divergence patterns in surface flows are typical (the window scene in Figure 3 (left) shows a convergent flow pattern), whereas whirl patterns are impossible.

*meaningful*, the meaning being imposed by the agent. The “hallucinations” have the same format as the data in the buffer, thus the probing is a simple matter of comparison. In the model, the format is simply an ordered list of six directions, and the comparison finds the mean of the matches, a match being defined as the cosine of the relative directions, a number between +1 (same directions) and −1 (opposite directions). The agent comes up with many hallucinations and, at any moment, favors the one that fits best (Selfridge, 1959). This competition among generations of hallucinations is akin to biological “survival of the fittest.” The survivors make up “visual awareness” (Hoffman, 2009). From an algorithmic point of view, the model is similar to the “harmony search” algorithm of “soft computing” (Geem, Kim, & Loganathan, 2001).
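The comparison step described here (mean cosine of the relative directions over an ordered list of six directions) can be illustrated with a small Python sketch. Everything beyond that definition is an assumption made for illustration: the disk layout in `POSITIONS`, the particular candidate set (uniform templates in 30° steps plus a diverging and a converging template), and the names `match` and `best_template`. The way the model actually generates its hallucinations is not reproduced.

```python
import math

# Assumed angular positions (degrees) of the six disks on the circle.
POSITIONS = [0, 60, 120, 180, 240, 300]


def match(template, data):
    """Mean cosine of the differences between two ordered lists of directions."""
    return sum(math.cos(math.radians(t - d)) for t, d in zip(template, data)) / len(data)


def best_template(candidates, data):
    """Return the (name, template) pair that fits the buffered data best."""
    return max(candidates, key=lambda item: match(item[1], data))


def uniform(direction):
    """Template with the same direction at every disk."""
    return [direction] * len(POSITIONS)


diverging = list(POSITIONS)                         # directions point outward
converging = [(p + 180) % 360 for p in POSITIONS]   # directions point inward
circular = [(p + 90) % 360 for p in POSITIONS]      # directions run around the circle

# Hypothetical candidate set of "hallucinations".
candidates = [(f"uniform {d}", uniform(d)) for d in range(0, 360, 30)]
candidates += [("diverging", diverging), ("converging", converging)]

print(best_template(candidates, uniform(90))[0])   # uniform 90
print(best_template(candidates, diverging)[0])     # diverging
print(abs(match(uniform(90), circular)) < 1e-9)    # True: no net match
```

Under these assumptions, a uniform template scores essentially zero against a circular field as a whole, since the cosines over six equally spaced directions cancel.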

*perfect* correlations because there are no spontaneous cap-to-cup flips; on the other hand, the correlations for the convergence are lower than those for the uniform case because here the cup-to-cap flips have a huge influence, most responses being of the “cup” type. The autocorrelations for the cyclical configurations are especially interesting since we see an anti-correlation peaked at the antipode of the fiducial position. This is due to the fact that the best the model can do is apply a uniform template, with the result that the configuration perceptually splits into two parts.

*gradient*. Thus, “local shading” is a *point property* even though the image intensity spatially varies. It is formally similar to the notion of “velocity,” which is a measure of temporal change at a single moment. In contradistinction, “multi-local properties” depend on the structures simultaneously encountered at mutually distinct points. Mathematical analysis in the local and multi-local cases has to be categorically different. Similarly, in neurophysiology, “local” may be taken to refer to the single receptive field case, whereas “multi-local” would refer to receptive field assemblies. Thus, “multi-local” presupposes local sign, whereas “local” does not. These distinctions are crucial in any formal account of spatiotemporal phenomena.

$R_{60^\circ}^{+}$ denote the number of “convex” responses at 60°, $R_{30^\circ}^{-}$ the number of “concave” responses at 30°, and so forth. Then, we define the “preference for light from above index $U$” as

$R(\varphi)$ is defined as the expectation of $F(\theta + \varphi)\,F(\theta)$ over $0 \le \theta < 360^\circ$, where $F(\theta) = +1$ for convex responses and $F(\theta) = -1$ for concave responses. Of course, the parameter is periodic, thus $\theta + \varphi$ is reckoned modulo 360°. We reckon $-180^\circ \le \varphi < +180^\circ$; in the experiment, the angle $\varphi$ takes only the discrete values of −180°, −120°, −60°, 0°, +60°, and +120° (where +180° repeats −180°).
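As an illustration of this definition, the following Python sketch computes the circular autocorrelation for one complete set of six responses. This is a simplifying assumption: the function name `autocorrelation` and the idea of feeding it a full six-element response vector are hypothetical, and the estimation from pooled pairs of target responses is not reproduced.

```python
def autocorrelation(responses):
    """Circular autocorrelation of +1 (convex) / -1 (concave) responses.

    `responses` lists F(theta) at equally spaced positions (0, 60, ..., 300
    degrees for six disks). Returns a dict mapping the shift phi in degrees,
    reckoned in [-180, 180), to the mean of F(theta + phi) * F(theta).
    """
    n = len(responses)
    step = 360 // n
    result = {}
    for k in range(n):
        mean_product = sum(responses[(i + k) % n] * responses[i] for i in range(n)) / n
        phi = k * step
        if phi >= 180:
            phi -= 360  # +180 repeats -180, +240 becomes -120, and so on
        result[phi] = mean_product
    return result


# A configuration that splits into two halves (three caps, three cups).
split = autocorrelation([+1, +1, +1, -1, -1, -1])
print(split[0], split[-180])  # 1.0 -1.0
```

For such a split configuration, the value at the antipodal shift is −1, whereas a fully synchronized configuration yields +1 at every shift.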

*ab initio* probability to see a cap is $Q$, and that of seeing a cup is $1 - Q$. Empirically, $Q \approx 3/4$. Let the probability that a cap will spontaneously flip to a cup within the exposure period of 1 s be $A$; likewise, let the probability that a cup will spontaneously flip into a cap be $A$. The probability of reversal $A$ is evidently very small; reversals are rarely noticed under conditions of strict fixation. The self-correlation $R$ is the mean of four possible events, namely cap–cap, cup–cup, cap–cup, and cup–cap. The former two events lead to a value of +1, and the latter two events lead to a value of −1. The probabilities of the four events are $Q(1 - A)$, $(1 - Q)(1 - A)$, $QA$, and $(1 - Q)A$, respectively, thus one has

$$R = Q(1 - A) + (1 - Q)(1 - A) - QA - (1 - Q)A = 1 - 2A.$$

Empirically, $R = 0.983 \ldots 1$ (inter-quartile range), leading to $A < 0.0085$ s$^{-1}$.
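The four-event bookkeeping can be checked numerically with a short Python sketch; the function names `self_correlation` and `reversal_probability` are illustrative assumptions. It shows that the expected self-correlation does not depend on $Q$ and that the lower quartile value of 0.983 corresponds to a reversal probability of about 0.0085.

```python
def self_correlation(q: float, a: float) -> float:
    """Expected self-correlation from the four event probabilities."""
    cap_cap = q * (1 - a)          # contributes +1
    cup_cup = (1 - q) * (1 - a)    # contributes +1
    cap_cup = q * a                # contributes -1
    cup_cap = (1 - q) * a          # contributes -1
    return cap_cap + cup_cup - cap_cup - cup_cap


def reversal_probability(r: float) -> float:
    """Invert R = 1 - 2A for the reversal probability A."""
    return (1 - r) / 2


print(round(reversal_probability(0.983), 4))   # 0.0085
print(round(reversal_probability(0.8333), 4))  # 0.0833
```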

$t/T$) ≈ $1 - t/T$, we have $t/T \approx A$, with $t = 1$ s, thus $T = 1/A$. The half-life time is $\log 2 \approx 0.693$ times as large. One finds a half-life time of at least 82 s in case spontaneous reversal occurs symmetrically (see Footnote 5).
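A minimal Python sketch of this conversion follows. It uses the linear approximation $t/T \approx A$ stated above; the function name `half_life` and its parameters are assumptions for illustration.

```python
import math


def half_life(a: float, t: float = 1.0) -> float:
    """Half-life in seconds for reversal probability a per exposure of t seconds.

    Uses the linear approximation t / T = a, so T = t / a, and the half-life
    is log(2) times the time constant T.
    """
    time_constant = t / a
    return math.log(2) * time_constant


print(half_life(0.0085))  # about 81.5 s, i.e. roughly 82 s
print(half_life(0.0833))  # about 8.3 s
```

Since 0.0085 is an upper bound on $A$, the first value is a lower bound on the half-life.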

*ab initio* probability to see a cap be $Q$ and that to see a cup be $(1 - Q)$. The probability to see six caps is $Q^{6}$, and that to see six cups is $(1 - Q)^{6}$. Thus, the probability to see six equal entities is $P = Q^{6} + (1 - Q)^{6}$, a symmetrical function of $Q$, being $2^{-5} = 0.03125$ at $Q = 0.5$ and 1 at $Q = 0$ or 1.

$S$. The correlation is expected to be $R = S - (1 - S) = 2S - 1$. Empirically, $R = 0.9$, implying $S = 0.95$. Let the probability to see a single item as cap be $Q$, and assume the two locations are independent. Then, the probability of cap–cap is $Q^{2}$, and that of cup–cup is $(1 - Q)^{2}$; thus, the probability to see the two as equals is $P = Q^{2} + (1 - Q)^{2}$. Empirically, $Q \approx 3/4$, implying $P = 0.625$. This is much smaller than the value of $S$ (which is 0.95, see above).
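The comparison between the observed and the independence-based probability of equal responses can be retraced in a few lines of Python. The function names are illustrative assumptions; the numbers are the empirical values quoted above.

```python
def same_from_correlation(r: float) -> float:
    """Probability S of equal responses implied by R = 2S - 1."""
    return (r + 1) / 2


def same_if_independent(q: float) -> float:
    """Probability of cap-cap or cup-cup for two independent locations."""
    return q**2 + (1 - q)**2


s_observed = same_from_correlation(0.9)    # about 0.95
p_independent = same_if_independent(0.75)  # 0.625
print(s_observed > p_independent)          # True
```

The gap between the two values is what indicates that the two locations are not independent in this condition.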