September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2018
Do you see how many I see? Quantifying human crowd counting accuracy over natural scenes
Author Affiliations
  • Logan Blake
    University of Central Florida, Burnett School of Biomedical Sciences
  • Ali Borji
    University of Central Florida, Department of Computer Science
Journal of Vision September 2018, Vol.18, 850. doi:

      Logan Blake, Ali Borji; Do you see how many I see? Quantifying human crowd counting accuracy over natural scenes. Journal of Vision 2018;18(10):850.


      © ARVO (1962-2015); The Authors (2016-present)


Previous research (e.g., Cicchini et al., 2016; Lee et al., 2016) has shown that humans can reliably estimate the number of items (a.k.a. numerosity) in simple synthetic arrays. However, the extent to which this capacity generalizes to complex realistic scenes (e.g., presidential inaugural photos) remains unknown. Here, we aim to quantify the accuracy of human subjects in crowd counting. During the experiment, each image was presented for a short duration of 5, 1, or 0.5 seconds (shuffled presentation; one duration at a time). The subject then reported the number of people present in the crowd, discretized into 5 categories (1-1K, 1K-2K, 2K-3K, 3K-4K, and 4K-5K), by pressing the corresponding key on the keyboard. Each image was followed by a white noise mask shown for 1 second and then by a blank screen that remained for 10 seconds or until a key press. Each category consisted of 14 images covering the whole crowd range in that category. Subjects were 12 undergraduates (6 male, 6 female) between 18 and 26 years old with normal or corrected-to-normal vision. Analysis of the data shows two main results. (a) Average accuracy is significantly above chance (33% vs. 20%). Subjects perform best on images with fewer than 1K people (55%), followed by images with more than 4K people (46%). The middle categories pose the most difficulty; in such cases, subjects are off by only one unit. (b) Estimation accuracy improves with presentation time: average accuracy drops significantly as presentation time decreases (33.8%, 31.2%, and 27.85% for 5, 1, and 0.5 seconds, respectively), and the drop is more severe going from 1 to 0.5 seconds. Our results show that humans are able to estimate numerosity over naturalistic stimuli with many items.
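
The scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the choice to treat each category's upper bound as inclusive (so that exactly 1,000 people falls in 1-1K), and the idea of measuring error in category steps are assumptions made for illustration only; the abstract does not specify them. The sketch shows how a crowd count maps to one of the five response categories, why chance performance is 20%, and one way to check how far off the incorrect responses are.

```python
# Illustrative sketch only; names and boundary handling are assumptions,
# not taken from the abstract.

CATEGORIES = ["1-1K", "1K-2K", "2K-3K", "3K-4K", "4K-5K"]

# With five equally likely response categories, random guessing is
# correct one time in five.
CHANCE = 1 / len(CATEGORIES)


def count_to_category(count: int) -> int:
    """Map a crowd count (1..5000) to a category index 0..4.

    Assumption: upper bounds are inclusive, so 1000 -> "1-1K".
    """
    if not 1 <= count <= 5000:
        raise ValueError("count must be between 1 and 5000")
    return (count - 1) // 1000


def accuracy(responses, truths):
    """Fraction of trials where the reported category is correct."""
    if len(responses) != len(truths) or not truths:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(r == t for r, t in zip(responses, truths))
    return hits / len(truths)


def mean_error_on_misses(responses, truths):
    """Average distance, in category steps, over incorrect trials only."""
    errors = [abs(r - t) for r, t in zip(responses, truths) if r != t]
    return sum(errors) / len(errors) if errors else 0.0
```

For example, with made-up responses `[0, 1, 2, 4]` against true categories `[0, 2, 2, 3]`, `accuracy` returns 0.5 and `mean_error_on_misses` returns 1.0, meaning every incorrect response landed in a neighboring category.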

Meeting abstract presented at VSS 2018

