Abstract
All human listeners perceive tones in the presence of regularly repeating patterns of sound pressure fluctuation over a wide range of frequencies. In music, the salient and widely shared features of this aspect of auditory perception are: 1) an iterated partitioning of the continuous dimension of pitch into octave intervals bounded by tones that are musically similar; 2) the division of each octave into the 12 intervals of the chromatic scale; 3) the preference in musical composition and performance for particular subsets of these 12 intervals (e.g., the intervals of the pentatonic or diatonic scales); and 4) the similar consonance ordering of chromatic-scale tone combinations produced by listeners of all ages, places, and periods. Despite intense interest in these perceptual phenomena over several millennia, they have no generally accepted explanation in physical, psychological, or biological terms. A rapidly growing body of work in vision has shown that the fundamental qualities that characterize visual percepts (lightness/brightness, color, geometry, and motion) accord with the probability distributions of the possible sources of visual stimuli. Since the uncertain provenance of sensory stimuli is general, this empirical solution to the inverse optics problem might be expected to extend to other sensory modalities. We therefore examined the hypothesis that musical percepts also arise from the statistical relationship between sound stimuli and their natural sources. An analysis of recorded speech shows that the probability distribution of amplitude/frequency combinations in human utterances, the principal source of periodic stimuli in the human acoustical environment, predicts octaves, scales, and consonance. These observations suggest that the auditory system, like the visual system, generates percepts determined by the probability distributions that link inherently ambiguous stimuli and their sources.
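For reference, the octave and chromatic-scale relationships in items 1) and 2) can be stated in the standard equal-tempered form; this convention is used here only as an illustrative assumption and is not the only tuning relevant to the argument:

$$ f_{\text{octave}} = 2\,f_0, \qquad f_k = f_0 \cdot 2^{k/12}, \quad k = 0, 1, \dots, 12, $$

so that each of the 12 chromatic semitones corresponds to a frequency ratio of $2^{1/12} \approx 1.0595$, and the twelfth step recovers the octave ($2^{12/12} = 2$).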