“…to highlight potentially relevant and thus interesting (that is to say, "salient") data, the vagueness and task-dependence of this problem description lead to a variety of models that may differ substantially in which parts of the signal they mark as being of interest. Unfortunately, in contrast to the fast-growing number of proposed visual saliency models (see [10,11]), only a few practically applicable models of acoustic attention exist (e.g., [1,6,7]). Most closely related to our work is the model described by Kayser et al. [6], which is based on the well-known visual saliency model of Itti et al. [12] and, most notably, has been successfully applied to speech processing by Kalinli et al. [7] and, in principle, by Lin et al. [8] to allow for faster human acoustic event detection through audio visualization.…”