To understand performance in measures of short-term memory binding, both memory representation and cognitive processing in the task should be considered. Different kinds of memory representations have been postulated for VSTM. Feature-bound memory representation (Luck & Vogel,
1997), unbound memory for features (Wheeler & Treisman,
2002), memory for configuration (Jiang, Olson, & Chun,
2000), and possibly others may be components of VSTM. Memory for feature binding has often been discussed under the notion of object files (Kahneman, Treisman, & Gibbs,
1992), so this study uses the term “object files” to refer to memory for feature bindings in general. In this work, we distinguish two types of object files. The first is a complete object file, which represents the complete set of an object's features. For example, when objects differing in shape and color are presented, the complete object file comprises a conjunction of shape, color, and location. The second is a partial object file, which represents only a partial set of features. Using the example above, a partial object file would comprise a conjunction of color and location, shape and location, or color and shape. The present study used objects defined by a combination of location, shape, and color, so a complete object file corresponds to memory for triple conjunctions (location, shape, and color), and partial object files correspond to memory for single conjunctions.
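For concreteness, this distinction can be sketched informally as follows; the sketch and its names are illustrative only and are not part of the experimental materials. A complete object file holds the triple conjunction of location, shape, and color, whereas partial object files hold only pairwise conjunctions derived from the same object.

```python
from typing import Dict, NamedTuple, Tuple

class CompleteObjectFile(NamedTuple):
    """Triple conjunction: the full feature set of a single object."""
    location: str
    shape: str
    color: str

def partial_object_files(obj: CompleteObjectFile) -> Dict[str, Tuple[str, str]]:
    """Pairwise conjunctions derivable from the same object."""
    return {
        "color-location": (obj.color, obj.location),
        "shape-location": (obj.shape, obj.location),
        "color-shape": (obj.color, obj.shape),
    }

# Example: a red triangle presented at the upper-left position.
red_triangle = CompleteObjectFile(location="upper-left", shape="triangle", color="red")
print(partial_object_files(red_triangle))
```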
Considering the cognitive processing involved in the task, performance may reflect encoding, maintenance, retrieval, and/or comparison. A deficit in a memory task may thus be caused by a limit in storage capacity or by a bottleneck in memory retrieval and/or in the comparison between memory and perceptual representations. Because memory retrieval and memory comparison could not be experimentally distinguished in the present study, we use the term memory retrieval to refer to both in what follows.
Some studies have suggested that the low estimated capacity for feature binding memory relative to feature memory may reflect differences in memory retrieval. Wheeler and Treisman (2002) compared the single-probe paradigm, in which only one object was presented in the probe display to be judged for the presence of a change, with the multiple-probe paradigm, in which the whole probe display had to be compared with the initial display. They showed that a single probe significantly improved performance in the binding condition relative to multiple probes. This improvement in task performance can be interpreted as a reduction of interference and/or a facilitation of memory retrieval by the single probe. Allen, Baddeley, and Hitch (
2006) reported similar results.
The single-probe advantage in the binding condition leaves open some questions regarding the nature of representation and processing in VSTM. First, the findings of Wheeler and Treisman (2002) do not necessarily imply that memory for object files in general suffers from a retrieval bottleneck in the multiple-probe condition. Wheeler and Treisman investigated complete object files from color–location conjunction stimuli and partial object files (shape and color) from triple conjunction stimuli, so whether a single-probe advantage is observed with complete object files from triple conjunction stimuli remains unknown. If the previous findings reflect the nature of object files in general, the single-probe advantage should be observed with complete object files from triple conjunction stimuli. Conversely, if the previous findings hold true only in certain special situations, the single-probe advantage may be limited to partial object files from triple conjunction stimuli.
To address this issue, however, the simple change-detection task with triple conjunction stimuli is insufficient, because triple conjunction representations are not necessary to detect a change. For example, suppose objects with color and shape are represented as two independent sets of partial object files: color–location and shape–location. This representational scheme is sufficient to detect a change with triple conjunction stimuli by monitoring changes in the two sets of partial object files independently. To deal with this problem, Saiki and Miyatsuji (2007) devised a type-identification paradigm. Following the logic of perceptual feature binding studies (Treisman & Schmidt, 1982), the type-identification paradigm requires information about the triple feature combination, not just simple conjunctions. Unlike simple change detection, type identification requires discriminating among sources of change by asking participants to identify the type of switch event, which forces them to take into account the triple conjunction of shape, color, and location. This task thus allows evaluation of memory for more complex feature representations. Saiki and Miyatsuji applied this task to an experimental paradigm called multiple object permanence tracking (MOPT), which is similar in logic to the binding conditions described by Wheeler and Treisman (2002) but is also able to evaluate any effect of object motion (
Figure 1). A series of experiments revealed that (1) task performance was quite poor even when the objects were stationary, and (2) object motion further impaired performance, even when the motion was slow. Memory capacity, estimated using a standard formula with a proper modification (see 1), was only about 1.5 objects when the objects were stationary and about 1 object when they were moving. Earlier studies (Saiki, 2003a, 2003b) showed that this impairment by object motion is not simply due to failure in object tracking but reflects an additional cost of spatiotemporal updating of feature binding. MOPT is therefore well suited to investigating spatiotemporal updating, an important characteristic of object file representation.
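For reference only, and not as a reconstruction of the modified formula used by Saiki and Miyatsuji (2007), a standard capacity estimate in change-detection studies (often referred to as Cowan's K) has the form

K = N (H − F),

where N is the number of objects in the memory display, H is the hit rate, and F is the false-alarm rate. Capacity values such as those reported above derive from applying a formula of this general kind, suitably modified for the type-identification task, to observed accuracy.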
The type-identification paradigm used in the present study can evaluate memory for object files more appropriately, but whether deficits in task performance reflect retrieval or maintenance per se remains an open question. Type-identification performance may be impaired not because object files have a smaller maintenance capacity than other types of VSTM representation, but because memory retrieval is more difficult for object files than for unbound features, as suggested by Wheeler and Treisman (2002).