Salience describes the phenomenon by which an object stands out from a scene. While its
underlying processes are extensively studied in vision, mechanisms of auditory salience
remain largely unknown. Previous studies have used well-controlled auditory scenes to shed
light on some of the acoustic attributes that drive the salience of sound events.
Unfortunately, the use of constrained stimuli, together with a lack of well-established
benchmarks of salience judgments, hampers the development of comprehensive theories of
sensory-driven auditory attention. The present study explores auditory salience in a set
of dynamic natural scenes. A behavioral measure of salience is collected by having human
volunteers listen to two concurrent scenes and indicate continuously which one attracts
their attention. By using natural scenes, the study takes a data-driven rather than
experimenter-driven approach to exploring the parameters of auditory salience. The
findings indicate that the space of auditory salience is multidimensional (spanning
loudness, pitch, spectral shape, as well as other acoustic attributes), nonlinear and
highly context-dependent. Importantly, the results indicate that contextual information
about the entire scene over both short and long scales needs to be considered in order
to properly account for perceptual judgments of salience.