Query by Example Methods for Audio Signals

Helen, Marko; Lahti, Tommi

doi:10.1109/norsig.2006.275240

Cited by 7 publications

(6 citation statements)

References 10 publications

“…However, the ISA usually yields large computational overheads. The superiority of HMM cross-likelihood ratio has been shown over GMM [75] and feature histograms [33] for class-based QBE. However, these studies exhibited that all approaches are highly sensitive to noise and low-quality sounds.…”

Section: A Content-based Audio Retrieval (mentioning)

confidence: 99%

“…In real-life conditions, we can expect audio collections to include sounds from different sources recorded under various conditions. Some QBE systems have been tested for robustness but usually only with regards to transcoding, using either lower sampling rates [33] or lossy data compressions [9] to simulate mobile audio databases. We test our approach by applying a wider range of distortion classes to simulate various low-quality conditions in recording…”

Section: E Robustness Analysis (mentioning)

confidence: 99%

Multiobjective Time Series Matching for Audio Classification and Retrieval

Esling

Agón

2013

IEEE Trans. Audio Speech Lang. Process.

Seeking sound samples in a massive database can be a tedious and time consuming task. Even when metadata are available, query results may remain far from the timbre expected by users. This problem stems from the nature of query specification, which does not account for the underlying complexity of audio data. The Query By Example (QBE) paradigm tries to tackle this shortcoming by finding audio clips similar to a given sound example. However, it requires users to have a well-formed soundfile of what they seek, which is not always a valid assumption. Furthermore, most audio retrieval systems rely on a single measure of similarity, which is unlikely to convey the perceptual similarity of audio signals. We address in this paper an innovative way of querying generic audio databases by simultaneously optimizing the temporal evolution of multiple spectral properties. We show how this problem can be cast into a new approach merging multiobjective optimization and time series matching, called MultiObjective Time Series (MOTS) matching. We formally state this problem and report an efficient implementation. This approach introduces a multidimensional assessment of similarity in audio matching. This allows to cope with the multidimensional nature of timbre perception and also to obtain a set of efficient propositions rather than a single best solution. To demonstrate the performances of our approach, we show its efficiency in audio classification tasks. By introducing a selection criterion based on the hypervolume dominated by a class, we show that our approach outstands the state-of-art methods in audio classification even with a few number of features. We demonstrate its robustness to several classes of audio distortions. Finally, we introduce two innovative applications of our method for sound querying.

“…The feature histogram method uses vector quantization to quantize feature vectors, generates feature histograms, and estimates the Euclidean distance between them [2]. The GMM method uses either the EM algorithm or Parzen window method to estimate a GMM for the example and evaluates the likelihood of the database sample.…”

Section: Simulation Experiments (mentioning)

confidence: 99%
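The feature histogram method quoted above (quantize feature vectors, build histograms, compare with Euclidean distance) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function names, the fixed `codebook` argument, and the normalization of the histograms are assumptions made here for the example. In practice the codebook would typically be trained beforehand, for instance with k-means.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector (row) to the index of its nearest codeword."""
    # Pairwise squared distances, shape (n_frames, n_codewords).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def feature_histogram(features, codebook):
    """Normalized histogram of codeword usage for one audio sample."""
    idx = quantize(features, codebook)
    counts = np.bincount(idx, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

def histogram_distance(feats_a, feats_b, codebook):
    """Euclidean distance between the codeword histograms of two samples."""
    h_a = feature_histogram(feats_a, codebook)
    h_b = feature_histogram(feats_b, codebook)
    return float(np.linalg.norm(h_a - h_b))

# Toy usage with a hypothetical two-codeword codebook.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
sample_a = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [10.0, 11.0]])
sample_b = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [10.0, 10.0]])
dist_ab = histogram_distance(sample_a, sample_b, codebook)
```

Normalizing the histograms makes samples of different lengths comparable, which is a design choice of this sketch rather than something stated in the quoted text.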

“…Thus, query by example is usually done in the following way [1,2,3]: first, features are extracted from the example and all the samples in the database. Second, the distances between the feature vectors of the example and the database samples are estimated using a certain distance metric.…”

Section: Introduction (mentioning)

confidence: 99%
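The two-step procedure described in this statement (extract features from the example and every database sample, then estimate distances) can be sketched as below. The particular features (mean frame RMS energy and mean zero-crossing rate), the Euclidean metric, and all function names are assumptions chosen for brevity; the cited papers use richer acoustic features and other distance measures.

```python
import numpy as np

def extract_features(signal, frame_len=256):
    """Summarize a signal by mean frame RMS energy and mean zero-crossing rate."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    # Fraction of adjacent sample pairs whose sign differs.
    zcr = (np.signbit(frames[:, 1:]) != np.signbit(frames[:, :-1])).mean(axis=1)
    return np.array([rms.mean(), zcr.mean()])

def query_by_example(example, database, frame_len=256):
    """Return database keys sorted from most to least similar to the example."""
    q = extract_features(example, frame_len)
    dists = {
        name: float(np.linalg.norm(q - extract_features(sig, frame_len)))
        for name, sig in database.items()
    }
    return sorted(dists, key=dists.get)

# Toy usage: synthetic tones at an assumed 8 kHz sampling rate.
t = np.arange(4096) / 8000.0
database = {
    "low_tone": np.sin(2 * np.pi * 200 * t),
    "high_tone": np.sin(2 * np.pi * 3000 * t),
}
ranking = query_by_example(np.sin(2 * np.pi * 220 * t), database)
```

Because the 220 Hz example has a zero-crossing rate close to that of the 200 Hz tone, the low tone is ranked first in this toy setup.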

“…Gabbouj et al [4] used a method, where samples were first classified into four main categories and then searched for similar samples only within the samples of the same main category. Helén and Lahti [2] used following three different methods. In the feature histogram method the feature vectors were quantized and the similarity was measured by the distance between their histograms.…”

Section: Introduction (mentioning)

confidence: 99%

Query by Example of Audio Signals using Euclidean Distance Between Gaussian Mixture Models

Helen

Virtanen

2007

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07

Self Cite

Query by example of multimedia signals aims at automatic retrieval of media samples from a database, which are similar to a user-provided example. This paper proposes a method for query by example of audio signals. The method calculates a set of acoustic features from the signals and models their probability density functions (pdfs) using Gaussian mixture models. The method measures the similarity between two samples using the Euclidean distance between their pdfs. A novel method for calculating the closed form solution of the distance is proposed. Simulation experiments show that the proposed method enables higher retrieval accuracy than the reference methods.
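One way to see why a closed form for the Euclidean distance between two Gaussian mixture pdfs exists: the squared distance expands into integrals of products of Gaussians, and the integral of the product of two Gaussians is itself a Gaussian density evaluated at one mean, with the covariances summed. The sketch below applies this standard identity to diagonal-covariance mixtures. It is an illustration under those assumptions and may differ in detail from the derivation proposed in the paper; the `(weights, means, variances)` tuple layout and the function names are hypothetical.

```python
import numpy as np

def gauss_overlap(mu_a, var_a, mu_b, var_b):
    """Integral of the product of two diagonal Gaussians: N(mu_a; mu_b, var_a + var_b)."""
    var = var_a + var_b
    diff = mu_a - mu_b
    norm = np.sqrt(np.prod(2.0 * np.pi * var))
    return np.exp(-0.5 * np.sum(diff ** 2 / var)) / norm

def cross_term(gmm_p, gmm_q):
    """Integral of p(x) * q(x) over x for two diagonal-covariance GMMs."""
    w_p, mu_p, var_p = gmm_p
    w_q, mu_q, var_q = gmm_q
    total = 0.0
    for i in range(len(w_p)):
        for j in range(len(w_q)):
            total += w_p[i] * w_q[j] * gauss_overlap(mu_p[i], var_p[i], mu_q[j], var_q[j])
    return total

def gmm_l2_distance(gmm_p, gmm_q):
    """Euclidean (L2) distance between two GMM pdfs, computed in closed form."""
    sq = cross_term(gmm_p, gmm_p) - 2.0 * cross_term(gmm_p, gmm_q) + cross_term(gmm_q, gmm_q)
    # Guard against tiny negative values caused by floating-point cancellation.
    return float(np.sqrt(max(sq, 0.0)))

# Toy usage: each GMM is (weights (K,), means (K, D), variances (K, D)).
gmm_a = (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]]))
gmm_b = (np.array([0.5, 0.5]), np.array([[0.0], [3.0]]), np.array([[1.0], [2.0]]))
dist_ab = gmm_l2_distance(gmm_a, gmm_b)
```

No sampling or numerical integration is needed, so the cost grows only with the number of component pairs.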
