“…Typical examples include speech recognition and perceptual enhancement [5][6][7][8], speaker indexing and diarization [14][15][16][17][18][19], voice/music detection and discrimination [1][2][3][4][9][10][11][12][13][20][21][22], information retrieval and genre classification of music [23,24], audio-driven alignment of multiple recordings [25,26], sound emotion recognition [27][28][29], and others [10][30][31][32]. In the media production and broadcasting domain, audio and audio-driven segmentation enable proper archiving, thus facilitating content reuse scenarios [1-3,11,12,16-20,31]. Besides internal searching and retrieval (within the media organization), publicity metrics of specific radio stations and programs can be associated with the presence of various audio classes (both speakers and types of music), providing valuable feedback to all involved in the broadcasting process (i.e., producers, advertisers, communication professionals, journalists, etc.)…”