Abstract
Ensemble perception is the ability to rapidly estimate a summary statistic from a set without serial inspection. But which stimulus properties influence how that summary is made?
In a within-subject experiment with per-trial feedback, subjects chose which of two sets had the larger average value. The stimuli were data visualizations: subjects judged which set had a higher position (dot plots), a larger size (floating bar graphs), or a redundantly coded higher position and larger size (regular bar graphs). The experiment also varied set size (1vs1, 12vs12, 20vs20, 12vs20, and 20vs12), mean difference between the sets (0 to 80 pixels in 10 pixel increments), and which set had the largest single value. With 25 repetitions per condition, each subject completed over 5,000 trials.
For single-item comparisons, position was unsurprisingly more precise than length alone. However, for set comparisons, the noisiness of ensemble coding appears to overpower these differences, so position, length, and the redundant combination have indistinguishable discriminability, which contradicts Cleveland & McGill (1984). Moreover, for all visual features, responses were biased towards the set with more items. Previous results (Yuan, Haroz, & Franconeri 2018) suggested that this bias is caused by estimating a sum or total area. But because the effect also occurs in the position (dot plot) condition, where a sum or total area is unhelpful, that explanation is unlikely. Additional analyses did not reveal a bias towards the set with the largest single value, the smallest single value, or the largest range of values. These results imply that the set size bias is holistic and not driven by simpler proxies.
As showing raw data rather than only summary statistics is common advice in visualization design, the set size bias could cause people to misinterpret visualizations that do not have the same number of items in each group.