Because noise interference is often present in audio processing, computing audio similarity directly from a distance measure is not robust enough. In this paper, we propose a novel audio similarity measure based on Renyi's quadratic entropy. We extract Mel Frequency Cepstral Coefficients (MFCCs) to represent each audio clip, and then calculate similarity from the entropy of the audio samples, using a probability density function (pdf) of the MFCCs estimated with a Parzen window. The experimental results show that (a) our approach outperforms the one based on Euclidean distance under common SNR conditions, and (b) our approach achieves 94.00% matching accuracy even when the signal-to-noise ratio (SNR) is 0 dB. In addition, our algorithm can also be applied to audio retrieval and music clustering.
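As a rough illustration of the entropy computation this abstract describes, the following Python sketch estimates Renyi's quadratic entropy from a set of feature vectors with a Gaussian Parzen window. The Gaussian kernel, the bandwidth `sigma`, the function names, and the use of a cross-entropy term to compare two clips are all assumptions made for illustration; the abstract does not specify them, and MFCC extraction is assumed to happen elsewhere.

```python
import numpy as np


def cross_information_potential(x, y, sigma=1.0):
    """Mean Gaussian kernel value over all pairs of rows of x and y.

    x, y: arrays of shape (n_frames, n_coeffs), e.g. MFCC frames (assumed input).
    sigma: Parzen window bandwidth (hypothetical default, not from the paper).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = x.shape[1]
    # Squared Euclidean distance between every row of x and every row of y.
    sq = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # Integrating the product of two Gaussian windows yields a Gaussian
    # with twice the variance.
    var = 2.0 * sigma ** 2
    norm = (2.0 * np.pi * var) ** (-d / 2.0)
    return float(np.mean(norm * np.exp(-sq / (2.0 * var))))


def renyi_quadratic_entropy(x, sigma=1.0):
    """Parzen estimate of H2(X) = -log integral p(x)^2 dx."""
    return -np.log(cross_information_potential(x, x, sigma))


def renyi_cross_entropy(x, y, sigma=1.0):
    """Assumed similarity proxy: -log integral p(x) q(x) dx (lower = closer)."""
    return -np.log(cross_information_potential(x, y, sigma))
```

Under these assumptions, a lower cross entropy between the MFCC sets of two clips would indicate that their estimated densities overlap more, which is one way a similarity score could be derived.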
An audio fingerprint is an effective representation of an audio signal built from low-level features, and it can be used to identify unlabeled audio based on its content. In this paper, we introduce a robust audio feature, the local energy centroid (LEC), which represents the degree of energy concentration in a relatively small region of the spectrum. Our audio fingerprint is generated from the LEC feature, which helps enhance the robustness of the system. For audio retrieval, an improved scoring strategy is proposed to resist linear speed changes. Experimental results show that the new fingerprinting system is quite robust in the presence of noise and that the proposed method achieves satisfactory recognition accuracy.
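The abstract does not give a formula for the LEC, so the Python sketch below shows only one plausible reading: the energy-weighted centroid of a small time-frequency patch of a magnitude spectrogram. The function name, the patch layout (rows as frequency bins, columns as time frames), and the handling of silent patches are assumptions for illustration, not the authors' definition.

```python
import numpy as np


def local_energy_centroid(patch):
    """Energy-weighted centroid (freq_index, time_index) of a spectrogram patch.

    patch: 2-D array of spectral magnitudes, rows = frequency bins,
           columns = time frames (assumed layout).
    """
    energy = np.abs(np.asarray(patch, dtype=float)) ** 2
    total = energy.sum()
    if total == 0.0:
        # Silent patch: fall back to the geometric center (an assumption).
        return ((energy.shape[0] - 1) / 2.0, (energy.shape[1] - 1) / 2.0)
    f_idx, t_idx = np.indices(energy.shape)
    freq_centroid = float((f_idx * energy).sum() / total)
    time_centroid = float((t_idx * energy).sum() / total)
    return (freq_centroid, time_centroid)
```

A fingerprint could then, for example, be built from how the centroid position varies across neighboring patches, though the abstract does not say how the bits are actually derived.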
MP3 is a popular compressed format for storing audio on personal computers and transmitting it over the internet, but very few noise reduction algorithms run directly in the compressed domain, and traditional methods are rather time-consuming because they require a full cycle of decompression, noise reduction, and recompression. In this paper, we propose a novel approach that combines MP3 coding with the spectral entropy of MDCT coefficients (MDCTs) to remove noise directly in the compressed domain. The MDCTs are extracted by partially decompressing the MP3 audio signal, and their entropy is then calculated in each granule to estimate the noise power spectrum. Experimental results show that the proposed method removes noise effectively in the compressed domain and improves the efficiency of noise reduction for MP3 audio.
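The per-granule entropy step can be sketched in Python as follows, assuming the MDCT coefficients of one granule are already available as an array (576 frequency lines per channel in MP3). This uses the standard Shannon spectral entropy of the normalized power spectrum; the function name and the `eps` threshold are hypothetical, and the abstract does not say how the entropy is mapped to a noise power estimate, so that step is not shown.

```python
import numpy as np


def spectral_entropy(mdct_coeffs, eps=1e-12):
    """Shannon entropy (in nats) of the normalized MDCT power spectrum.

    mdct_coeffs: 1-D array of MDCT coefficients for one granule (assumed input).
    eps: energy threshold below which the granule is treated as silent.
    """
    power = np.asarray(mdct_coeffs, dtype=float) ** 2
    total = power.sum()
    if total <= eps:
        # Silent granule: no distribution to measure, report zero entropy.
        return 0.0
    p = power / total
    p = p[p > 0]  # skip empty bins so that 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())
```

A flat, noise-like granule gives an entropy near the maximum of `log(N)` for `N` coefficients, while a granule dominated by a few tonal lines gives a value near zero, which is what makes the measure a candidate for separating noise from signal.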