The perception of sound textures, a class of natural sounds defined by statistical sound structure, such as fire, wind, and rain, has been proposed to arise through the integration of time-averaged summary statistics. Where and how the auditory system might encode these summary statistics to create internal representations of these stationary sounds, however, is unknown. Here, using natural textures and synthetic variants with reduced statistics, we show that summary statistics modulate the correlations between frequency-organized neuron ensembles in the awake rabbit inferior colliculus. These neural ensemble correlation statistics capture high-order sound structure and allow for accurate neural decoding in a single-trial recognition task with evidence accumulation times approaching 1 s. In contrast, the average activity across the neural ensemble (neural spectrum) provides a fast (tens of ms) and salient signal that contributes primarily to texture discrimination. Intriguingly, perceptual studies in human listeners reveal analogous trends: the sound spectrum is integrated quickly and serves as a salient discrimination cue, while high-order sound statistics are integrated slowly and contribute substantially more towards recognition. The findings suggest that statistical sound cues such as the sound spectrum and correlation structure are represented by distinct response statistics in auditory midbrain ensembles, and that these neural response statistics may have dissociable roles and time scales for the recognition and discrimination of natural sounds.

SIGNIFICANCE STATEMENT

Being able to recognize and discriminate natural sounds, such as from a running stream, a crowd clapping, or ruffling leaves, is a critical task of the normally functioning auditory system. Humans can easily perform such tasks, yet they can be particularly difficult for the hearing impaired and they challenge our most sophisticated computer algorithms.
This difficulty is attributed to the complex physical structure of such natural sounds and the fact that they are not unique: they vary randomly in a statistically defined manner from one excerpt to another. Here we provide the first evidence, to our knowledge, that the central auditory system is able to encode and utilize statistical sound cues for natural sound recognition and discrimination behaviors.