October 2003
Volume 3, Issue 9
Vision Sciences Society Annual Meeting Abstract  |   October 2003
Multiple object tracking is scene-based, not image-based
Author Affiliations
  • Geniva Liu
    Dept of Psychology, Univ of British Columbia, Canada
  • Erin L Austen
    Dept of Psychology, Univ of British Columbia, Canada
  • Mark I Rempel
    Dept of Psychology, Univ of British Columbia, Canada
  • Kellogg S Booth
    Dept of Computer Science, Univ of British Columbia, Canada
  • Brian Fisher
    Dept of Computer Science, Univ of British Columbia, Canada
  • James T Enns
    Dept of Psychology, Univ of British Columbia, Canada
Journal of Vision October 2003, Vol.3, 330. doi:https://doi.org/10.1167/3.9.330
Abstract

Multiple object tracking (MOT) is the ability to individuate a moving object based solely on its spatial-temporal history. We examined whether MOT relies on a scene-based (allocentric) or an image-based (egocentric) representation.

Observers viewed 16 objects moving in a depicted 3D wireframe box. On each trial, 2, 4, or 6 objects were briefly tagged as the ‘target’ class. All objects then underwent 10 s of random motion (1 or 6 deg/s) before stopping. A single object was then tagged, which the observer identified as a target or a non-target.
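The trial structure described above can be sketched as a small design generator. This is only an illustrative sketch: the object count, target-set sizes, speeds, and tracking duration come from the abstract, while the function names, the dictionary layout, and the assumption that the probe is a target on half of the trials are hypothetical and not stated in the source.

```python
import itertools
import random

# Values taken from the abstract.
N_OBJECTS = 16
TARGET_SET_SIZES = (2, 4, 6)
SPEEDS_DEG_PER_S = (1, 6)
TRACKING_DURATION_S = 10


def make_trial(set_size, speed, rng):
    """Build one hypothetical trial description (names are illustrative)."""
    targets = set(rng.sample(range(N_OBJECTS), set_size))
    non_targets = set(range(N_OBJECTS)) - targets
    # Assumption (not in the abstract): the probe is a target on ~half of trials.
    probe_is_target = rng.random() < 0.5
    pool = targets if probe_is_target else non_targets
    probe = rng.choice(sorted(pool))
    return {
        "set_size": set_size,
        "speed_deg_per_s": speed,
        "duration_s": TRACKING_DURATION_S,
        "targets": targets,
        "probe": probe,
        "probe_is_target": probe_is_target,
    }


def make_design(seed=0):
    """One trial per combination of target-set size and object speed."""
    rng = random.Random(seed)
    return [
        make_trial(size, speed, rng)
        for size, speed in itertools.product(TARGET_SET_SIZES, SPEEDS_DEG_PER_S)
    ]


design = make_design()
```

Crossing three set sizes with two speeds gives six cells; a real session would repeat and shuffle these cells, which the sketch leaves out.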

Preliminary experiments established that MOT was impaired by increases in both size of the target class and speed of object motion. Next, the motion pattern of the 3D box was manipulated. Thus, in addition to varying the speed of objects relative to the center of the box (object motion), the motion of the whole box was varied (scene motion). Unlike variations in object motion, which had a large influence on accuracy, variations in scene motion had no measurable influence. This was true whether the scene underwent translation, zoom, rotation, or even a combination of all three motions (‘combined motion’).
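The distinction between object motion and scene motion can be illustrated with a toy calculation. The sketch below is a 2D simplification (the study used a depicted 3D box) and every name and number in it is hypothetical: an object takes a small step in box coordinates while the box itself rotates, zooms, and translates between two frames. The step measured relative to the box stays small, while the step measured in image coordinates is much larger.

```python
import math


def scene_to_image(point, angle_rad, zoom, shift):
    """Map a point from box (scene) coordinates to image coordinates.

    The box pose is modelled as rotation, then uniform zoom, then translation.
    This 2D model is an assumption made for illustration only.
    """
    x, y = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (zoom * (c * x - s * y) + shift[0],
            zoom * (s * x + c * y) + shift[1])


def displacement(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


# Object position in box coordinates on two successive frames.
p0, p1 = (0.0, 0.0), (1.0, 0.0)

# Hypothetical box pose on each frame: the box rotates by 90 degrees,
# doubles in size, and shifts 3 units to the right between the frames.
img0 = scene_to_image(p0, 0.0, 1.0, (0.0, 0.0))
img1 = scene_to_image(p1, math.pi / 2, 2.0, (3.0, 0.0))

scene_step = displacement(p0, p1)      # motion relative to the box
image_step = displacement(img0, img1)  # motion in the image

print(f"scene-relative step: {scene_step:.3f}")
print(f"image-relative step: {image_step:.3f}")
```

Under a scene-based account, tracking difficulty should follow `scene_step`; under an image-based account it should follow `image_step`. The reported finding that scene motion had no measurable influence is what the first account predicts.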

To tax the ability to use a scene-based representation, we projected the ‘combined motion’ condition onto an obliquely viewed surface. This created retinal motions of the objects and box consistent with an orthogonal view, but the apparent motions underwent large changes because of the affine stretching of the projected image. Nonetheless, MOT accuracy was unaffected. Accuracy was only reduced when we projected the ‘combined motion’ onto a convex corner formed from the junction of two surfaces, the same conditions under which pictorial shape constancy is no longer possible.

These results imply that MOT is accomplished with a scene-based representation. It is motion of objects relative to the larger scene that determines performance, not motion of objects relative to egocentric landmarks like retinal location.

Liu, G., Austen, E. L., Rempel, M. I., Booth, K. S., Fisher, B., Enns, J. T. (2003). Multiple object tracking is scene-based, not image-based [Abstract]. Journal of Vision, 3(9): 330, 330a, http://journalofvision.org/3/9/330/, doi:10.1167/3.9.330.