This study examined the ability of human observers to discriminate between symmetric and asymmetric planar figures from perspective and orthographic images. The first experiment showed that the discrimination is reliable for polygons but not for dotted patterns. The second experiment showed that the discrimination is facilitated when the projected symmetry axis or the projected symmetry lines are known to the subject. A control experiment showed that the discrimination is more reliable with orthographic than with perspective images. Based on these results, we formulated a computational model of symmetry detection. The model measures the asymmetry of the presented polygon from a single orthographic or perspective image. Performance of the model is similar to that of the subjects.

*skewed symmetry* (Kanade, 1981; Kanade & Kender, 1983). Originally, the term skewed symmetry was used exclusively in the context of an orthographic projection. An orthographic projection is an approximation to a perspective projection, which is the correct model for the formation of images in the eye or camera. This approximation is good when the object is small compared to the viewing distance. More precisely, an orthographic approximation to a perspective projection is good when the range in depth of the object or figure is small relative to the viewing distance. In practice, it is usually assumed that “small” means less than 10%. Despite the fact that retinal images of symmetric figures are almost never themselves symmetric, skewed symmetry has received much less attention in prior psychophysical research than the case of symmetric images. Before we discuss prior psychophysical research on skewed symmetry, we briefly review the relevant geometry.

$\sigma$), tilt ($\tau$), and roll ($\rho$). Slant is the angle between the observer's line of sight and the normal to the plane of the figure. Slant ranges between 0 and 90 deg. Tilt is the angle between the projection to the image plane of the normal of the plane of the figure and the $x$-axis of the image plane. Tilt specifies the axis of rotation around which the plane is rotated in depth; the axis of rotation is in the image plane and it is orthogonal to tilt. Tilt ranges between 0 and 360 deg. Roll is the angle of rotation of a 2D figure about the normal to the plane of the figure.

Psychophysical experiments on skewed symmetry are usually done by means of computer graphics. Namely, instead of using physical figures slanted in depth, the subject is presented with perspective images of the figures shown on a computer screen. If the observer's eye is placed at the center of perspective projection that was used to compute the perspective images, the retinal image in the observer's eye produced by the perspective image of a slanted figure is itself a perspective image of the slanted figure.

Let the line of sight of the subject be parallel to the $z$-axis of the 3D Cartesian coordinate system. Let $z = 0$ be the plane of the computer screen, called an image plane, and let the $x$- and $y$-axes of the 3D Cartesian coordinate system be used as the 2D coordinate system on the image plane. Let $(C_x, C_y, C_z)^T$ be the center of perspective projection ($C_z \neq 0$) (recall that $(C_x, C_y, C_z)^T$ is a column vector obtained by transposing $(C_x, C_y, C_z)$). When $\sigma = 0$ deg, the plane of the figure is parallel to the computer screen and is represented by the equation $z = z_f$, where $z_f$ is constant ($z_f$ can be set to zero without restricting generality). Let the 2D figure be represented by a set of points $(x_{2D}, y_{2D}, z_f)^T$. Note that all these points have the same $z$-coordinate when $\sigma = 0$ deg. When slant $\sigma$ is not zero, the 3D coordinates of each point of the simulated 2D figure can be computed as follows:

$(x_{3D}, y_{3D}, z_{3D})^T$ in a 3D space to the 2D image plane (computer screen) can be computed as follows (see Figure 1):

$(x_p, y_p)^T$ is a perspective image of $(x_{3D}, y_{3D}, z_{3D})^T$.

$C_z \to \infty$). In such a case, Equation 2 takes the following form:

$(x_o, y_o)^T$ is an orthographic image of $(x_{3D}, y_{3D}, z_{3D})^T$.
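As a hedged illustration of the geometry described above, the following Python sketch rotates a planar figure by slant, tilt, and roll and then projects it to the image plane $z = 0$. The rotation order, the function names, and the closed-form projection (the intersection of the ray from the center of projection through the point with the plane $z = 0$) are assumptions made for this sketch; they are consistent with the definitions in the text but are not necessarily the exact form of Equations 1 to 3.

```python
import math

def rotate_figure(points_2d, slant_deg, tilt_deg, roll_deg):
    """Place 2D points in the plane z = 0 and rotate them in depth.

    Assumed convention: R = Rz(tilt) * Ry(slant) * Rz(-tilt) * Rz(roll),
    i.e. roll within the figure's plane, then a rotation by `slant` about
    the image-plane axis that is orthogonal to the tilt direction.
    """
    s, t, r = (math.radians(a) for a in (slant_deg, tilt_deg, roll_deg))
    out = []
    for x, y in points_2d:
        # Roll: rotation about the normal of the figure's plane.
        x, y = x * math.cos(r) - y * math.sin(r), x * math.sin(r) + y * math.cos(r)
        # Express the point along (u) and across (v) the tilt direction.
        u = x * math.cos(t) + y * math.sin(t)
        v = -x * math.sin(t) + y * math.cos(t)
        # Slant: rotate about the v-axis; the figure starts at z = 0.
        u3, z3 = u * math.cos(s), -u * math.sin(s)
        # Return to image-plane x, y coordinates.
        out.append((u3 * math.cos(t) - v * math.sin(t),
                    u3 * math.sin(t) + v * math.cos(t),
                    z3))
    return out

def perspective(point_3d, center):
    """Project a 3D point to the plane z = 0 from the center (Cx, Cy, Cz)."""
    x, y, z = point_3d
    cx, cy, cz = center
    t = cz / (cz - z)  # ray parameter at which the ray reaches z = 0
    return (cx + t * (x - cx), cy + t * (y - cy))

def orthographic(point_3d):
    """Limit of `perspective` as Cz goes to infinity: drop the z-coordinate."""
    return (point_3d[0], point_3d[1])
```

With slant set to zero the figure stays in the image plane, and for a very distant center of projection the perspective image approaches the orthographic one, which is the sense in which the orthographic projection is an approximation.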

$(C_x, C_y, C_z)^T$ that was used to compute the perspective images. Only then will the retinal image in the observer's eye be a valid perspective image of the simulated 2D figure slanted in the 3D space. Otherwise, the retinal image will be a composition of two perspective projections, which is a *projective*, not a perspective, transformation of the simulated 3D figure (Coxeter, 1987; Pizlo, Rosenfeld, & Weiss, 1997a, 1997b; Wagemans, Lamote, & van Gool, 1997). If an orthographic approximation to perspective projection is used, then the position of the observer's eye is irrelevant, as long as the line of sight is orthogonal to the computer screen. In such a case, the retinal image will also be an orthographic transformation (up to size scaling) of the simulated 2D figure. Again, the orthographic approximation to perspective projection is good when the viewing distance is large compared to the range in depth of the simulated 2D figure.

$r$ and $\theta$. Each polygon had 9 to 12 vertices. For an asymmetric polygon, the radius $r$ of each vertex was random in the range between 3.5 and 14.1 cm, and its orientation $\theta$ was random in the range between 0 and 360 deg. The vertices were connected by a polygonal line in the order of increasing value of $\theta$. By doing this, we ensured that the polygon did not produce self-intersections. For a symmetric polygon, half of the vertices were generated first in the same way as for an asymmetric polygon, but the orientations were restricted to the range between 0 and 180 deg. Next, mirror reflections of these points about the horizontal axis were generated and all vertices were connected as before. If the number of vertices was odd, one of the vertices was placed on the horizontal axis. If the number of vertices was even, none or two of them were placed on the horizontal axis. The origin of the polar coordinate system was placed at the center of the monitor. The dotted stimuli were generated the same way as the polygon stimuli, and the dots were placed at the vertices of the polygon. The stimulus occupied an area whose radius was at most 19.4 deg. We used large stimuli to make sure that perspective effects were clearly noticeable (see an example in Figure 2). For small stimuli, perspective projection becomes indistinguishable from orthographic projection. Note that if the fixation point had been at the center of the figure, part of the stimulus would have been projected to the optic disk of the eye and would thus have remained invisible. To avoid this problem, the fixation point was shifted 9 deg to the right from the center of the monitor. The line of sight connecting the fixation point and the viewing eye was perpendicular to the monitor.
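The generation procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, uniform sampling is assumed for both radius and orientation, and the choices the text leaves open (which end of the horizontal axis receives the on-axis vertex, and whether an even polygon gets none or two on-axis vertices) are made by a coin flip.

```python
import random

R_MIN, R_MAX = 3.5, 14.1  # radius range in cm, as stated in the text

def asymmetric_polygon(n, rng=random):
    """Return n vertices (r, theta_deg), connected in order of increasing theta."""
    verts = [(rng.uniform(R_MIN, R_MAX), rng.uniform(0.0, 360.0)) for _ in range(n)]
    return sorted(verts, key=lambda v: v[1])

def symmetric_polygon(n, rng=random):
    """Return n vertices that are mirror symmetric about the horizontal axis."""
    if n % 2 == 1:
        on_axis = [rng.choice([0.0, 180.0])]      # odd n: one vertex on the axis
    else:
        on_axis = rng.choice([[], [0.0, 180.0]])  # even n: none or two on the axis
    # Half of the off-axis vertices, restricted to orientations of 0 to 180 deg.
    half = [(rng.uniform(R_MIN, R_MAX), rng.uniform(0.0, 180.0))
            for _ in range((n - len(on_axis)) // 2)]
    verts = [(rng.uniform(R_MIN, R_MAX), th) for th in on_axis]
    for r, th in half:
        verts.append((r, th))
        verts.append((r, (360.0 - th) % 360.0))   # mirror reflection about the axis
    return sorted(verts, key=lambda v: v[1])
```

Sorting by orientation before connecting the vertices is what keeps the polygon free of self-intersections, because every vertex is then visited once in a single sweep around the origin.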

$d'$ and its standard error. The standard error was computed from two values of $d'$ (recall that there were two sessions per condition). Higher performance corresponds to higher values of $d'$. Chance performance is represented by $d' = 0$. Perfect performance is represented by $d' = \infty$. The subject ran a number of practice sessions to become familiar with the experiment and stimuli. Each session started with a block of 16 practice trials. After the practice trials, the subject was informed about the proportion of correct responses and was given an option to repeat them. This option was rarely exercised. Before each session, the subject was told which condition would be tested.
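For readers unfamiliar with the measure, a minimal Python sketch of the standard signal detection computation is given below. It assumes the usual equal-variance definition, $d' = z(\text{hit rate}) - z(\text{false alarm rate})$, and assumes that the standard error is the standard error of the mean of the two session values; the text does not spell out either formula, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_score(p):
    """Inverse of the standard normal CDF, extended to the endpoints 0 and 1."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return NormalDist().inv_cdf(p)

def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance d': 0 at chance, infinite for perfect performance."""
    return z_score(hit_rate) - z_score(false_alarm_rate)

def mean_and_se(d1, d2):
    """Mean of two session d' values and the standard error of that mean.

    With two values the sample SD is |d1 - d2| / sqrt(2), so the standard
    error of the mean is |d1 - d2| / 2 (assumed estimator).
    """
    return (d1 + d2) / 2.0, abs(d1 - d2) / 2.0
```

For example, equal hit and false alarm rates give $d' = 0$, and a hit rate of 1 combined with a false alarm rate of 0 gives $d' = \infty$, matching the two reference points named in the text.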

$d'$.^{1} The results were analyzed using a two-way ANOVA within-subjects design: figure type (polygon vs. dots) × angle between the symmetry axis and the tilt direction (0 deg vs. 45 deg vs. random).

$F_{1,15} = 79.08$, $p < 0.001$). In fact, with dotted stimuli, performance was close to chance level in the 45 deg and random angle conditions ($d'$ is comparable to the standard errors). This means that subjects could not detect skewed symmetry in perspective images of dotted patterns, although they could do so reliably in the case of polygons. The main effect of the angle between the axis of symmetry and the tilt was also significant ($F_{2,15} = 59.42$, $p < 0.001$). Specifically, performance in the 0 deg condition was substantially higher than that in the 45 deg and random angle conditions. This means that symmetry in the retinal image is easier to detect than skewed symmetry. Performance in the 45 deg condition was the worst. This is because symmetry is maximally distorted when the angle between the axis of symmetry and the tilt is 45 deg (Wagemans et al., 1991, 1992). Finally, there was a significant interaction between the figure type and the angle between the symmetry axis and the tilt ($F_{2,15} = 4.08$, $p < 0.05$). This interaction was due to the fact that the difference in performance between the 0 deg and 45 deg conditions was large and significant in the polygon condition (Tukey HSD: $p < 0.001$) but smaller and non-significant in the dots condition ($p = 0.055$). This is most likely related to a floor effect; performance in the dots conditions was always quite poor.

$d'$. The results were analyzed using a two-way ANOVA within-subjects design (the random condition was not included in this analysis). The interaction was not significant ($p = 0.425$), but both main effects were ($p < 0.001$). In the case of the “direction” factor, performance in the main direction (horizontal or vertical) was higher than that in the diagonal direction. To evaluate the effect of the “known orientation” factor, an a posteriori test (Tukey HSD) was applied. Performance in the two conditions where the tilt direction was fixed was worse than performance in the two conditions where the direction of the projected symmetry axis was fixed ($p < 0.005$) and where the direction of the projected symmetry lines was fixed ($p < 0.001$). To compare the performance in these conditions to that in the random condition, a one-way ANOVA ($F_{6,12} = 28.90$, $p < 0.001$) followed by an a posteriori test (Tukey HSD) was applied. Performance in the random condition was not significantly different from that where the tilt direction was diagonal ($p > 0.999$), but it was worse than in the other five conditions ($p < 0.05$).

*not* a valid image of the slanted 2D figure. So, from the point of view of geometrical optics, perspective projection is *better* than orthographic projection. But from the point of view of perceptual mechanisms of symmetry detection, this may or may not be true. Specifically, if the visual system uses properties of orthographic projection, then orthographic images are “better” than perspective images. Perspective images are only approximations. If, on the other hand, the visual system uses properties of perspective projection, then perspective images are “better” and orthographic images are only approximations. It follows that if the visual system uses the rules of perspective projection in detection of skewed symmetry, performance should be higher with perspective images. If the visual system uses the rules of orthographic projection in detection of skewed symmetry, performance should be higher with orthographic images.

^{2} It can be seen that the performance for orthographic projection was better than that for perspective projection ($p < 0.005$), although the magnitude of this difference was rather small. A large difference was not expected simply because the perspective and orthographic images were similar to each other under the viewing conditions used here (an example is shown in Figure 2). The difference between perspective and orthographic images would have been larger if either the viewing distance had been smaller or the computer monitor had been larger. The fact that orthographic images produced more reliable performance suggests that the visual system uses the rules of orthographic rather than perspective projection in detection of skewed symmetry. This result is important because it is perspective, not orthographic, projection that adequately describes the rules of image formation in the eye.

$p < 0.001$; interaction: $p < 0.05$). The effect of “known orientation” has already been demonstrated and discussed in Experiment 2. The new result here is the interaction. The interaction effect seems to be produced by the fact that the effect of the type of projection is larger in the horizontal projected symmetry lines condition than in the other two “known orientation” conditions. To examine the interaction effect, a one-way ANOVA was applied to the difference in performance between the two types of projection (orthographic minus perspective) for the three “known orientation” conditions ($F_{2,6} = 6.50$, $p < 0.05$). An a posteriori test (Tukey HSD) showed that the difference in the horizontal projected symmetry lines condition was significantly larger than that in the random condition ($p < 0.05$). These results suggest that the parallelism of the projected symmetry lines, as well as the orientation of the projected symmetry axis, is used by the visual system in the process of detection of skewed symmetry. Recall that the parallelism of symmetry lines is an invariant of orthographic but not perspective projection of a symmetric figure. Also, the projected symmetry axis connects the midpoints of the projected symmetry line segments under orthographic, but not under perspective, projection.

$n$ vertices. This polygon has $n$ possible placements of the projected symmetry axis. If $n$ is odd, each possible symmetry axis crosses $(n - 1)/2$ symmetry lines, including a side of the polygon. If $n$ is even, each possible symmetry axis crosses either $n/2$ symmetry lines, including two sides of the polygon, or $(n - 2)/2$ symmetry lines. Correspondence of pairs of symmetric vertices is uniquely specified for each possible symmetry axis: starting from an intersection of a possible symmetry axis and the polygon, the $k$th vertex in a clockwise direction should form a skewed symmetric pair with the $k$th vertex in a counterclockwise direction.^{3} (Note that in the case of dotted stimuli, an additional process for finding a skewed symmetric partner of each dot would be needed.) The model examines the parallelism of the projected symmetry line segments and the collinearity of their midpoints for each possible symmetry axis. Then, it chooses the symmetry axis and the corresponding symmetry line segments that maximize the parallelism and the collinearity of midpoints.
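The enumeration of candidate axes described above can be sketched in Python as follows. Only the enumeration follows the text directly. The cost function is a hypothetical stand-in, since the model's actual asymmetry measure is not reproduced here: it combines the circular dispersion of the symmetry line orientations with the share of midpoint variance that lies off the best-fitting line. It relies on the two orthographic invariants only, so it is not a model of the perspective case. All function names are assumptions.

```python
import math

def candidate_pairings(n):
    """Vertex pairings for the n candidate symmetry axes of an n-gon.

    Candidate s pairs vertex i with vertex (s - i) mod n; a vertex that is
    paired with itself lies on the candidate axis.
    """
    result = []
    for s in range(n):
        pairs = sorted({tuple(sorted((i, (s - i) % n)))
                        for i in range(n) if i != (s - i) % n})
        on_axis = [i for i in range(n) if i == (s - i) % n]
        result.append((pairs, on_axis))
    return result

def asymmetry(points, pairs, on_axis):
    """Hypothetical cost: near 0 for parallel symmetry lines with collinear midpoints."""
    c = s = 0.0
    mids = [points[i] for i in on_axis]       # on-axis vertices lie on the axis
    for i, j in pairs:
        (x1, y1), (x2, y2) = points[i], points[j]
        phi = math.atan2(y2 - y1, x2 - x1)
        c += math.cos(2 * phi)                # doubled angle: lines have no sign
        s += math.sin(2 * phi)
        mids.append(((x1 + x2) / 2, (y1 + y2) / 2))
    non_parallel = 1.0 - math.hypot(c, s) / len(pairs)
    mx = sum(p[0] for p in mids) / len(mids)
    my = sum(p[1] for p in mids) / len(mids)
    sxx = sum((p[0] - mx) ** 2 for p in mids)
    syy = sum((p[1] - my) ** 2 for p in mids)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in mids)
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    # Smaller eigenvalue of the midpoint scatter matrix, relative to its trace.
    lam_min = (tr - math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2
    non_collinear = max(lam_min, 0.0) / tr if tr > 0 else 0.0
    return non_parallel + non_collinear

def best_axis(points):
    """Exhaustive search: index and cost of the candidate with the lowest cost."""
    costs = [asymmetry(points, p, a) for p, a in candidate_pairings(len(points))]
    s = min(range(len(costs)), key=costs.__getitem__)
    return s, costs[s]
```

The pairing rule reproduces the counts given in the text: for odd $n$ every candidate has one on-axis vertex and $(n - 1)/2$ pairs, and for even $n$ the candidates alternate between two on-axis vertices with $(n - 2)/2$ pairs and no on-axis vertices with $n/2$ pairs.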

^{4} A symmetric polygon and its orthographic image are related by a 2D affine transformation. It follows that a reconstruction of a symmetric figure must use a subset of a 2D affine transformation.

Model | $R^2$ | Coefficient | Intercept |
---|---|---|---|
TS | | | |
(i) | 0.865 | 0.957 ± 0.038 | 0.403 ± 0.026 |
(ii) | 0.954 | 1.005 ± 0.022 | 0.070 ± 0.018 |
(iii) | 0.939 | 1.013 ± 0.026 | 0.207 ± 0.077 |
(iv) | 0.207 | 1.592 ± 0.315 | 0.195 ± 0.084 |
(v) | 0.249 | 0.922 ± 0.162 | 0.207 ± 0.077 |
(vi) | 0.954 | 1.005 ± 0.022 | −1.323 ± 0.042 |
(vii) | 0.954 | 1.005 ± 0.022 | 1.463 ± 0.027 |
OK | | | |
(i) | 0.853 | 0.863 ± 0.036 | 0.125 ± 0.009 |
(ii) | 0.887 | 0.958 ± 0.034 | 0.006 ± 0.010 |
(iii) | 0.902 | 0.941 ± 0.031 | 0.065 ± 0.008 |
(iv) | 0.112 | 1.188 ± 0.338 | 0.244 ± 0.068 |
(v) | 0.107 | 0.554 ± 0.162 | 0.273 ± 0.064 |
(vi) | 0.887 | 0.958 ± 0.034 | −1.305 ± 0.064 |
(vii) | 0.887 | 0.958 ± 0.034 | 1.343 ± 0.039 |

$n$ possible orientations, where $n$ is the number of vertices in a polygon.

$d' = 3.09 \pm 0.060$) and perspective projection ($d' = 2.54 \pm 0.018$). The $d'$s were calculated based on the criterion for which the hit rate was equal to the correct rejection rate. Note that performance for orthographic projection is slightly better than that for perspective projection.
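One way to obtain $d'$ at the criterion where the hit rate equals the correct rejection rate is sketched below in Python. It is a sketch under assumptions: the inputs are hypothetical lists of asymmetry scores for symmetric and asymmetric shapes, a shape is called symmetric when its score does not exceed the criterion, and with finite samples the criterion with the smallest difference between the two rates is used.

```python
from statistics import NormalDist

def d_prime_equal_rates(sym_scores, asym_scores):
    """Return (criterion, d') where hit rate is closest to correct rejection rate.

    NormalDist().inv_cdf raises StatisticsError if a rate is exactly 0 or 1,
    so perfectly separated samples need separate handling.
    """
    best = None
    for crit in sorted(set(sym_scores) | set(asym_scores)):
        hit = sum(s <= crit for s in sym_scores) / len(sym_scores)
        cr = sum(a > crit for a in asym_scores) / len(asym_scores)
        if best is None or abs(hit - cr) < best[0]:
            best = (abs(hit - cr), crit, hit, cr)
    _, crit, hit, cr = best
    z = NormalDist().inv_cdf
    # The false alarm rate is 1 minus the correct rejection rate.
    return crit, z(hit) - z(1.0 - cr)
```

When the two rates are exactly equal, this reduces to $d' = 2\,z(\text{hit rate})$, so the measure depends on a single proportion.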

$d'$s for all seven models are shown in Figure 13. It can be seen that performance of the first five models is equally good, but that of the last two models is somewhat lower. These results show that evaluating the asymmetry of a given shape does not depend strongly on how the shape is reconstructed. The rest of the simulations use model (ii) of symmetry reconstruction.

$d'$s of our model shown in Figure 13 (model (ii)) to $d'$s of the subjects shown in Figure 9, random condition. Performance of the model is about twice as high as that of the subjects. Recall that our model performs an exhaustive search for the projected symmetry lines and symmetry axis. It is reasonable to expect that subjects did not perform an exhaustive search, considering that the exposure duration was short (100 ms). Indeed, fixing the orientation of the symmetry lines or symmetry axis led to substantially better performance (see Figure 9). This result strongly suggests that without knowing these orientations, the visual system tries only a few of them. To evaluate the effect of the amount of search for the orientation of the projected symmetry lines on performance, we performed the next simulation experiment. We tested 10 conditions, corresponding to the number of possible orientations of the projected symmetry lines that were tried. This number ranged from 1 to 9, plus an exhaustive search. The actual orientations were chosen randomly, and the best was used to measure the asymmetry of a polygon.

$d'$, and the abscissa shows the number of orientations that were tried as possible orientations of projected symmetry lines. As expected, performance is better when the search involves more orientations. For eight orientations, the performance of the model ($d' = 1.35 \pm 0.040$ under perspective projection, and $d' = 1.65 \pm 0.086$ under orthographic projection) is closest to the average performance of human subjects in the random condition (see Figure 9).
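The restricted search used in this simulation can be sketched in a few lines of Python. The sketch assumes that an asymmetry value has already been computed for every candidate orientation of a polygon (the hypothetical list `costs`) and that lower values mean a better fit; the function name and the sampling without replacement are assumptions.

```python
import random

def limited_search(costs, k, rng=random):
    """Lowest asymmetry among k randomly chosen candidate orientations.

    If k is at least the number of candidates, this is an exhaustive search.
    """
    k = min(k, len(costs))
    tried = rng.sample(range(len(costs)), k)  # random subset, no repeats
    return min(costs[i] for i in tried)
```

With small `k` the true symmetry lines are often missed, so a symmetric polygon receives an inflated asymmetry value; this is the mechanism by which fewer tried orientations lower the simulated $d'$.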

- the presence of the orientation information of contours in the case of polygons and
- the information about the order of vertices in the case of polygons.

^{5} To make the characteristic points stable, the curvature function was smoothed by a local smoothing operator. The numbers below the shapes represent the measure of asymmetry produced by our model. When these numbers are compared to the criterion that our model used for discriminating between skewed symmetric and asymmetric polygons (Figure 12), it is seen that the shapes in Figure 17 would all be classified as skewed symmetric. Once we know that the model can be applied to smooth contours, the next question is whether the model can be applied to contours of 3D smooth surfaces and 3D volumetric objects. The answer is “probably yes,” as long as the contours are piece-wise planar. Testing human performance in detecting symmetry of 3D surfaces and 3D shapes, and generalizing the model to such stimuli, will be addressed in our future work.

^{2} Results of subject ZP in Experiment 2 were used as his results in the perspective condition in this experiment because he had already reached asymptotic performance. TS ran the perspective conditions again because his performance had improved somewhat compared to Experiment 2. YL ran all conditions because he was not tested in Experiment 2.

^{4} A related question about the types and order of affine transformations determining the shape percept was studied by Wagemans, Vanden Bossche, Segers, and d'Ydewalle (1994).