Dictionary extraction from a collection of spectrograms for bioacoustics monitoring

Ruiz-Muñoz, José Francisco; You, Zeyu; Raich, Raviv; Fern, Xiaoli Z.

doi:10.1109/mlsp.2015.7324357

Cited by 3 publications

(4 citation statements)

References 17 publications

“…In experiments, we present an application of the proposed approach for (i) denoising spectrograms, which are corrupted by rain noise, (ii) unsupervised bird syllable discovery and (iii) supervised classification of birdsong recordings. This paper extends our work in [24] to include detailed derivations as well as a multi-label classification framework for the proposed dictionary learning approach.…”

Section: Introduction (mentioning)

confidence: 82%

“…, $D_K\}$ and the sparse [Figure 1: Reproduction of a convolutive model for dictionary learning [24]. This illustration shows how the elements $Y_i(f, t)$ of a spectrogram are computed by applying the convolution operation between the elements of the dictionary words $d_1(t, f)$, $d_2(t, f)$ and $d_3(t, f)$, and the activation signals $a_1^i(t)$, $a_2^i(t)$ and $a_3^i(t)$.]…”

Section: Problem Formulation (mentioning)

confidence: 99%
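
The convolutive model described in the figure caption above can be illustrated with a minimal NumPy sketch. It assumes that each spectrogram row is the sum, over dictionary words, of the 1-D convolution along time between that word's row and the word's activation signal. The function name `synthesize_spectrogram`, the array shapes, and the toy values are hypothetical choices for illustration, not taken from the cited papers.

```python
import numpy as np

def synthesize_spectrogram(words, activations):
    """Sum of time-axis convolutions between dictionary words and activations.

    words:       list of K arrays, each of shape (n_freq, width)
    activations: array of shape (K, n_frames), one activation signal per word
    returns:     array of shape (n_freq, n_frames + width - 1)
    """
    n_freq, width = words[0].shape
    n_frames = activations.shape[1]
    Y = np.zeros((n_freq, n_frames + width - 1))
    for d_k, a_k in zip(words, activations):
        for f in range(n_freq):
            # Full 1-D convolution along time for frequency bin f.
            Y[f] += np.convolve(a_k, d_k[f])
    return Y

# Toy example (assumed values): three random words, two of them activated once.
rng = np.random.default_rng(0)
words = [rng.random((4, 3)) for _ in range(3)]
activations = np.zeros((3, 10))
activations[0, 2] = 1.0   # word 0 starts at frame 2
activations[1, 6] = 0.5   # word 1 starts at frame 6 with half amplitude
Y = synthesize_spectrogram(words, activations)
```

With impulse-like activations, each active word is simply pasted into the spectrogram at the activation time and scaled by the activation value, which is what makes sparse activations interpretable as syllable onsets.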

Dictionary Learning for Bioacoustics Monitoring with Applications to Species Classification

Ruiz-Muñoz

You

Raich

et al. 2016

J Sign Process Syst

Self Cite

This paper deals with the application of the convolutive version of dictionary learning to analyze in situ audio recordings for bioacoustics monitoring. We propose an efficient approach for learning and using a sparse convolutive model to represent a collection of spectrograms. In this approach, we identify repeated bioacoustics patterns, e.g., bird syllables, as words and represent new spectrograms using these words. Moreover, we propose a supervised dictionary learning approach in the multiple-label setting to support multi-label classification of unlabeled spectrograms. Our approach relies on a random projection for reduced computational complexity. As a consequence, the non-negativity requirement on the dictionary words is relaxed. Furthermore, the proposed approach is well-suited for a collection of discontinuous spectrograms. We evaluate our approach on synthetic examples and on two real datasets consisting of multiple-bird audio recordings. Bird syllable dictionary learning from a real-world dataset is demonstrated. Additionally, we successfully apply the approach to spectrogram denoising and species classification.

“…In order to provide a benchmark, we considered a two-step approach: a generative convolutive dictionary learning method followed by a classifier (footnote 3). For the implementation of the generative dictionary learning method, we chose [49] (used previously on the HJA dataset) and constructed a generative dictionary $D = \{d_1, d_2, \ldots$…”

Section: B Synthetic Datasets and Settings (mentioning)

confidence: 99%

“…Using the 10 MC runs, we evaluated the proposed GDL-LR approach by training on a fixed number of 5000 outer iterations as in [49]. We vary the dictionary window size $T_d \in \{5, 10, 20, 40, 60, 80\}$, sparsity regularization $\lambda_s \in \{10^{-8}, 10^{-6}, 10^{-4}, 10^{-2}, 10^{0}, 10^{2}\}$ and the number of dictio- [Footnote 2: These class templates are defined by selecting the most frequent 3 patterns in the generated 2-D signals.]…”

Section: B Synthetic Datasets and Settings (mentioning)

confidence: 99%
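
The parameter sweep quoted above can be enumerated as in the following sketch. It covers only the two parameter sets given in full in the excerpt (window size and sparsity regularization); the third parameter is cut off in the quotation and is deliberately left out. The variable names are hypothetical.

```python
from itertools import product

# Values copied from the quoted statement; names are illustrative only.
window_sizes = [5, 10, 20, 40, 60, 80]                     # T_d
sparsity_weights = [10.0 ** e for e in range(-8, 3, 2)]    # lambda_s: 1e-8 ... 1e2

# Every (T_d, lambda_s) combination to be evaluated.
grid = list(product(window_sizes, sparsity_weights))
```

Each entry of `grid` would then be passed to whatever training routine is under evaluation; the excerpt states that each configuration was trained for a fixed 5000 outer iterations.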

Weakly Supervised Dictionary Learning

You

Raich

Fern

et al. 2018

IEEE Trans. Signal Process.

Self Cite

We present a probabilistic modeling and inference framework for discriminative analysis dictionary learning under a weak supervision setting. Dictionary learning approaches have been widely used for tasks such as low-level signal denoising and restoration as well as high-level classification tasks, which can be applied to audio and image analysis. Synthesis dictionary learning aims at jointly learning a dictionary and corresponding sparse coefficients to provide accurate data representation. This approach is useful for denoising and signal restoration, but may lead to sub-optimal classification performance. By contrast, analysis dictionary learning provides a transform that maps data to a sparse discriminative representation suitable for classification. We consider the problem of analysis dictionary learning for time-series data under a weak supervision setting in which signals are assigned a global label instead of an instantaneous label signal. We propose a discriminative probabilistic model that incorporates both label information and sparsity constraints on the underlying latent instantaneous label signal using cardinality control. We present the expectation maximization (EM) procedure for maximum likelihood estimation (MLE) of the proposed model. To facilitate a computationally efficient E-step, we propose both a chain and a novel tree graph reformulation of the graphical model. The performance of the proposed model is demonstrated on both synthetic and real-world data.
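
The contrast between synthesis and analysis dictionary learning drawn in this abstract can be written in generic textbook form, as sketched below. These are common formulations and not the probabilistic model of the cited paper. The symbols are assumptions introduced here: $X$ is a matrix whose columns are signals, $D$ a synthesis dictionary, $A$ the sparse coefficients, $\Omega$ an analysis operator, and $\lambda$ a sparsity weight.

```latex
\begin{align}
\text{synthesis:}\quad & \min_{D,\,A}\; \tfrac{1}{2}\,\|X - DA\|_F^2 + \lambda\,\|A\|_1, \\
\text{analysis:}\quad  & \min_{\Omega}\; \|\Omega X\|_1
  \quad \text{subject to constraints on } \Omega \text{ that exclude } \Omega = 0 .
\end{align}
```

In the synthesis form the data are reconstructed from sparse coefficients, while in the analysis form the operator is applied directly to the data so that the transformed signal is sparse.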
