“…More precisely, it is assumed that the feature vectors in each of the two audio segments arise from some probability distribution (e.g., the multivariate Gaussian distribution); then, the distance between the two segments is represented by the dissimilarity between the two distributions. Several distance measures have been proposed, e.g., the Kullback-Leibler distance (KL or KL2) [10], the Generalized Likelihood Ratio (GLR) [9], [14], ∆BIC [11], [15], [16], [13], [17], [18], the Bhattacharyya distance [12], and the XBIC [19]. In addition, some high-level features have been used for audio segmentation; e.g., the spectrum flux and zero-crossing rate (ZCR) [20], [21], and the smoothed zero-crossing rate (SZCR) [22].…”
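
As an illustration of the distribution-based distances mentioned above, the following is a minimal sketch of the symmetric Kullback-Leibler distance (KL2) between two segments, each modeled as a Gaussian. It is not the implementation from any of the cited works. It assumes diagonal covariance matrices for simplicity (the excerpt only says "multivariate Gaussian"), and the function names `gaussian_stats` and `kl2_distance`, the variance floor, and the toy feature vectors are all hypothetical choices made for this example.

```python
from statistics import fmean, pvariance


def gaussian_stats(frames, var_floor=1e-8):
    """Estimate per-dimension mean and variance for one segment.

    `frames` is a list of equal-length feature vectors (e.g. one per audio frame).
    Assumption: dimensions are treated as independent (diagonal covariance).
    """
    dims = list(zip(*frames))  # transpose: one tuple of values per dimension
    means = [fmean(d) for d in dims]
    # Floor the variance so constant dimensions do not cause division by zero.
    variances = [max(pvariance(d, mu), var_floor) for d, mu in zip(dims, means)]
    return means, variances


def kl2_distance(seg_a, seg_b):
    """Symmetric KL distance, KL(a||b) + KL(b||a), for diagonal Gaussians."""
    mu_a, var_a = gaussian_stats(seg_a)
    mu_b, var_b = gaussian_stats(seg_b)
    total = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        # Variance mismatch term: zero when the variances are equal.
        total += 0.5 * (va / vb + vb / va - 2.0)
        # Mean mismatch term: squared mean difference scaled by both variances.
        total += 0.5 * (ma - mb) ** 2 * (1.0 / va + 1.0 / vb)
    return total


if __name__ == "__main__":
    # Toy 2-dimensional feature vectors standing in for real audio features.
    segment_a = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    segment_b = [[10.0, 1.0], [12.0, 3.0], [14.0, 5.0]]
    print(kl2_distance(segment_a, segment_a))  # identical segments
    print(kl2_distance(segment_a, segment_b))  # shifted mean in dimension 0
```

In a segmentation setting, a distance like this would typically be evaluated on adjacent windows sliding over the feature sequence, with large values suggesting a candidate change point. A full-covariance version, or one of the other measures listed (GLR, ∆BIC, Bhattacharyya), would follow the same pattern of fitting a distribution per segment and comparing the two fits.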