Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval 2008
DOI: 10.1145/1460096.1460115

Large-scale content-based audio retrieval from text queries

Abstract: In content-based audio retrieval, the goal is to find sound recordings (audio documents) based on their acoustic features. This content-based approach differs from retrieval approaches that index media files using metadata such as file names and user tags. In this paper, we propose a machine learning approach for retrieving sounds that is novel in that it (1) uses free-form text queries rather than sound-sample-based queries, (2) searches by audio content rather than via textual metadata, and (3) can scale to very …

Citations: cited by 80 publications (60 citation statements)
References: 13 publications
“…These range from parametric signal processing-based approaches [7]-[9] to automatic speech recognition (ASR) inspired methods [10] which often make use of mel-frequency cepstral coefficients (MFCCs) [11] and similar features. One promising new approach uses time-frequency domain spectrogram image features (SIF), introduced by Jonathan Dennis et.…”
Section: Robust Sound Event Classification Using Deep Neural Network (mentioning)
confidence: 99%
“…An average spectrum vector is calculated and the derivation of an estimate for tuning derivation is done by stimulating the filter bank shifts using weighted binning techniques [1]. The pitch representation is performed by the decomposition of a given music signal on 88 frequency bands with centre frequencies corresponding to the pitches A0 to C8 (MIDI pitches p=21 to p=108) in order to extract the chroma features.…”
Section: Chromagram (mentioning)
confidence: 99%
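
The pitch decomposition described in the statement above can be illustrated with a small sketch. It assumes equal temperament with A4 (MIDI pitch 69) tuned to 440 Hz, which the quoted text does not state, and the function names `midi_to_hz` and `fold_to_chroma` are hypothetical helpers introduced here for illustration, not code from the citing paper. It computes the centre frequency of each of the 88 bands (MIDI pitches 21 to 108) and folds per-band energies into 12 chroma bins by pitch class.

```python
# Illustrative sketch only: assumes equal temperament, A4 = MIDI 69 = 440 Hz.
A4_MIDI = 69
A4_HZ = 440.0
LOWEST_PITCH = 21    # A0
HIGHEST_PITCH = 108  # C8
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_hz(pitch):
    """Centre frequency in Hz of a MIDI pitch under the stated tuning assumption."""
    return A4_HZ * 2.0 ** ((pitch - A4_MIDI) / 12.0)


def fold_to_chroma(band_energies):
    """Sum 88 per-band energies (pitches 21..108) into 12 pitch-class bins."""
    expected = HIGHEST_PITCH - LOWEST_PITCH + 1
    if len(band_energies) != expected:
        raise ValueError(f"expected {expected} band energies")
    chroma = [0.0] * 12
    for offset, energy in enumerate(band_energies):
        pitch = LOWEST_PITCH + offset
        chroma[pitch % 12] += energy  # pitch class: 0 = C, ..., 9 = A
    return chroma


if __name__ == "__main__":
    centres = [midi_to_hz(p) for p in range(LOWEST_PITCH, HIGHEST_PITCH + 1)]
    print(f"{len(centres)} bands from {centres[0]:.2f} Hz to {centres[-1]:.2f} Hz")
    # With unit energy in every band, each bin counts how many bands it absorbs.
    for name, value in zip(PITCH_CLASSES, fold_to_chroma([1.0] * 88)):
        print(name, value)
```

Because the 88 bands do not cover a whole number of octaves, the pitch classes A, A#, B and C each collect one more band than the others; a real system would typically normalise the chroma vector afterwards.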
“…Retrieval of multimedia system information is totally different from retrieval of structured information. Music audio retrieval is normally done by annotating the media with text and uses text based retrieval systems to perform music retrieval [1]. But when the music information is voluminous, text based annotation becomes a tedious job.…”
Section: Introduction (mentioning)
confidence: 99%
“…The BOAW has recently been used for audio document retrieval [35] and copy detection [36], as well as MED tasks [37]. Our recent work [38] describes the basic BOAW approach.…”
Section: Audio Features (mentioning)
confidence: 99%