June 2004
Volume 4, Issue 8
Vision Sciences Society Annual Meeting Abstract  |   August 2004
What do we see when we glance at a scene?
Author Affiliations
  • Li Fei-Fei
    California Institute of Technology, USA
Journal of Vision August 2004, Vol.4, 863. doi:https://doi.org/10.1167/4.8.863

What exactly do we see when we glance at a natural scene? And does what we see change as the glance becomes longer? We asked naïve subjects to report what they saw in briefly presented photographs. Subjects received no specific information about the content of each stimulus and reported what they saw in free-form text. Our paradigm thus differs from previous studies, in which subjects were cued before a stimulus was presented, probed with multiple-choice questions, or both. Our experiment consisted of two stages. First, a group of 22 native English-speaking subjects was shown 100 novel gray-scale photographs foveally. The photographs contained a broad sample of real-life scenarios, both indoor and outdoor. The presentation time for each photograph was chosen at random from a set of 7 possible durations (from 27 ms to 500 ms). A perceptual mask followed each photograph immediately. After each presentation, subjects reported what they had just seen as completely as possible. In the second stage, a separate group of sophisticated individuals, who were not aware of the goals of the experiment, scored each of the descriptions produced by the subjects in the first stage. Individual scores were assigned to different categories of content: sensory data (edges, colored patches, etc.), objects, relations between objects, description of the overall scene (gist), completeness, and the writer's confidence. We find that for most subjects, perception of objects, object relations, and overall scenes improves sharply for display times between 50 ms and 80 ms. For longer display times, the quality of their reports improves minimally. We find little evidence to support a temporal order in the report of objects vs. “gist”. The identity of famous faces was among the last pieces of information to be perceived, typically requiring display times above 100 ms. The presence of animals, people, and vehicles was reported early, often at display times below 50 ms.

Fei-Fei, L., Koch, C., Iyer, A., Perona, P. (2004). What do we see when we glance at a scene? [Abstract]. Journal of Vision, 4(8): 863, 863a, http://journalofvision.org/4/8/863/, doi:10.1167/4.8.863.
L.F-F. is supported by an NSF fellowship and a Paul and Daisy Soros Fellowship for New Americans. This project is funded by an NSF grant through the Engineering Research Center at Caltech.
