In natural viewing, humans continuously vary accommodation and eye position to bring into focus on the fovea an image of what is currently being fixated. For an emmetropic or refraction-corrected visual system, all objects in the visual field that are at the accommodative distance will, to a first approximation, form sharp images on the retinae. Images of objects that are closer or farther than the accommodative distance will instead be out of focus, and these objects will be imaged with increasing blur with depth away from the plane of fixation (Equation 1, Figure 1). Blur can be defined as the diameter of the circle C over which the point at Z1 is imaged at the retina when the lens is focused at distance Z0. C can be defined as (Held, Cooper, O'Brien, & Banks, 2010)

C = A s |1/Z0 − 1/Z1|,  (Equation 1)

where A is the pupil diameter and s is the distance from the lens to the retina, i.e., the posterior nodal distance.
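With all distances expressed in common units (meters here), Equation 1, C = A s |1/Z0 − 1/Z1|, can be evaluated directly. The following sketch uses illustrative parameter values (a 4.6 mm pupil and a 17 mm posterior nodal distance), which are assumptions for the example rather than values taken from this article:

```python
def blur_circle_diameter(A, s, Z0, Z1):
    """Diameter of the retinal blur circle (Equation 1).

    A  -- pupil diameter (m)
    s  -- lens-to-retina (posterior nodal) distance (m)
    Z0 -- distance at which the lens is focused (m)
    Z1 -- distance of the imaged point (m)
    """
    return A * s * abs(1.0 / Z0 - 1.0 / Z1)

# Illustrative values: 4.6 mm pupil, 17 mm posterior nodal distance.
A, s = 0.0046, 0.017
print(blur_circle_diameter(A, s, Z0=0.5, Z1=0.5))  # in-focus point: 0.0
print(blur_circle_diameter(A, s, Z0=0.5, Z1=1.0))  # 1 D of defocus: ~78 um blur circle
```

Note that the blur circle depends on the dioptric difference |1/Z0 − 1/Z1|, not on the metric separation of the points, which is why equal metric depth steps produce more blur near the observer than far away.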
The visual experience when using manufactured media (computers, televisions, books, newspapers) is very different from natural viewing conditions. While natural conditions contain a broad and continuously varying range of visual depth information, manufactured displays usually contain a narrow, fixed range of depth. For example, many artificial images contain little or no blur, because they are designed so that a user can extract information from all areas of the image (e.g., when using a desktop application or playing a videogame). In reading a book or using an e-reader, the pages are sharp all over, and the accommodative distance is much closer than in typical natural conditions, especially outdoors (Geisler, 2008). In movies and television, spatial blur is often deliberately manipulated to direct the viewer's attention to specific, in-focus portions of the scene (Katz, 1991; Kosara, Miksch, & Hauser, 2001). Such conditions differ from the real world, where the range of depths at which our eyes accommodate produces variation in retinal blur across space and over time. Furthermore, stereoscopic 3-D is increasingly used to promote the illusion of depth, but current display technology nearly always presents depth information uncoupled from focus information. Away from fixation, however, blur can be a more precise cue to depth than binocular disparity, and the visual system appears to rely on the more informative cue when both are available (Held, Cooper, & Banks, 2012).
The presence of defocus blur has been shown to diminish visual fatigue during viewing of stereoscopic 3-D stimuli (Hoffman, Girshick, Akeley, & Banks, 2008). We were interested in simulating, via image processing, the changes in peripheral blur produced by naturalistic accommodative changes. Our aim was to develop a novel method of presenting both depth and blur information in a way that simulates natural viewing conditions. We implemented a real-time gaze-contingent stereoscopic display with naturalistic distributions of blur and disparity across the retina. At fixation, the display is kept in focus, analogous to foveation and accommodation in the real world. In peripheral vision, images are presented with amounts of blur that increase with distance from the simulated depth plane of fixation. To control dioptric blur, we took advantage of light-field rendering photographic technology (Ng, 2006; Ng et al., 2005). We employed light-field photographs of natural scenes taken with a Lytro™ plenoptic camera (Lytro Inc., Mountain View, CA; Figure 2a). For a single photographic exposure, light-field cameras output an "image stack" of images focused at different depths, a depth map of the captured scene (Figure 2b), and a depth lookup table. Although this system does not approximate the optical aberrations of an individual, it provides a general-purpose method for simulating changes in blur and depth that may be implemented in a practical, low-cost system. In this article, we examine whether any functional benefit can be obtained with such a general approach.
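The depth lookup table converts raw depth-map values into scene depths. As a rough sketch (the actual Lytro table format is not specified here; the table below is a hypothetical monotonic mapping from 8-bit depth-map values to diopters), the conversion can be done by linear interpolation:

```python
import numpy as np

# Hypothetical lookup table: raw 8-bit depth-map values -> scene depth in
# diopters (near -> far). Values are illustrative, not Lytro's actual table.
raw_values = np.array([0.0, 64.0, 128.0, 192.0, 255.0])
diopters = np.array([4.0, 2.0, 1.0, 0.5, 0.0])

def depth_at(depth_map, x, y):
    """Convert the depth-map value at pixel (x, y) to diopters by
    linear interpolation through the lookup table."""
    raw = float(depth_map[y, x])
    return float(np.interp(raw, raw_values, diopters))

# Toy depth map whose every pixel has raw value 128 (1 D in this table).
depth_map = np.full((480, 640), 128, dtype=np.uint8)
print(depth_at(depth_map, 320, 240))  # 1.0
```

Interpolating through the table, rather than indexing it directly, keeps the conversion well defined even when the table samples depth only sparsely.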
While a subject freely viewed such photographs, we used an eye tracker to monitor the point of fixation and determined the depth of this location from the light-field image's depth map. With this implementation, we then rendered the appropriate dioptric blur for all other points in the image in real time by selecting the appropriate image from the light-field image stack. Thus, the distribution of blur across the retina was controlled in real time, simulating the spatial and temporal distributions of image blur that occur under free viewing in natural conditions. We used this display to examine how eye movements and binocular fusion depend on depth cues from blur and stereoscopic disparity in naturalistic images.
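The per-fixation selection step can be sketched as follows. The stack layout, variable names, and focal depths are assumptions for illustration, not the authors' implementation: the idea is simply to display the stack image whose focal depth is nearest the scene depth at the tracked fixation point.

```python
import numpy as np

def select_focal_slice(stack, slice_depths, depth_map, gaze_xy):
    """Pick the image in the light-field stack whose focal depth is
    closest to the scene depth at the current fixation point.

    stack        -- list of images, one per focal depth
    slice_depths -- focal depth (diopters) of each image in the stack
    depth_map    -- per-pixel scene depth (diopters)
    gaze_xy      -- (x, y) fixation point from the eye tracker
    """
    x, y = gaze_xy
    fixation_depth = depth_map[y, x]
    idx = int(np.argmin(np.abs(np.asarray(slice_depths) - fixation_depth)))
    return stack[idx]

# Toy example: three slices focused at 2.0, 1.0, and 0.5 diopters.
slice_depths = [2.0, 1.0, 0.5]
stack = [np.full((4, 4), d) for d in slice_depths]
depth_map = np.full((4, 4), 0.9)  # fixated region lies near 1 D
frame = select_focal_slice(stack, slice_depths, depth_map, (2, 2))
print(frame[0, 0])  # 1.0: the 1-D slice is nearest to 0.9 D
```

Because the whole selected slice is displayed, points whose depth differs from the fixation depth are rendered with the blur the camera optics gave them at that focal setting, which is what produces the naturalistic peripheral blur gradient.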