Bird detection in audio: A survey and a challenge

Stowell, Dan; Wood, Michael D.; Stylianou, Yannis; Glotin, Hervé

doi:10.1109/mlsp.2016.7738875

Cited by 111 publications

(114 citation statements)

References 31 publications

Supporting

Mentioning

113

Contrasting


“…The Bird Audio Detection challenge [5] consists of a development and an evaluation set. The development set consists of freefield1010 (field recordings gathered by the FreeSound project) and warblr (crowd-sourced recordings collected through smartphone app) datasets, and the evaluation set consists of chernobyl (collected by unattended recorders in Chernobyl exclusion zone) dataset.…”

Section: Datasets

mentioning

confidence: 99%

“…The grid search covers each of the combinations of the following hyperparameter values: the number of CNN feature maps/RNN hidden units (the same amount for both) {96, 256}; the number of recurrent layers {1, 2, 3}; and the number of convolutional layers {1, 2, 3, 4} with the following frequency max pooling arrangements after each convolutional layer {(4), (2, 2), (4, 2), (8, 5), (2, 2, 2), (5, 4, 2), (2, 2, 2, 1), (5, 2, 2, 2)}. Here, the numbers denote the number of frequency bands at each max pooling step; e.g., the configuration (5, 4, 2) pools the original 40 bands to one band in three stages: 40 bands → 8 bands → 2 bands → 1 band.…”

Section: Evaluation Metric and Configuration

mentioning

confidence: 99%
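The grid described in the statement above can be enumerated in a few lines. This is a minimal sketch, not the citing authors' code: it assumes that the number of convolutional layers is implied by the length of each pooling arrangement, and that each max pooling step is non-overlapping, so the band count is divided by the pooling size. The names `bands_after_pooling` and `grid` are hypothetical.

```python
from itertools import product

N_BANDS = 40                      # input frequency bands, as stated in the quote
UNITS = [96, 256]                 # CNN feature maps = RNN hidden units
RECURRENT_LAYERS = [1, 2, 3]
# One tuple per arrangement; its length is taken as the number of conv layers.
POOLING = [(4,), (2, 2), (4, 2), (8, 5), (2, 2, 2),
           (5, 4, 2), (2, 2, 2, 1), (5, 2, 2, 2)]


def bands_after_pooling(n_bands, pools):
    """Return the band count before pooling and after each pooling step.

    Assumes non-overlapping pooling (integer division by the pool size).
    """
    sizes = [n_bands]
    for pool in pools:
        sizes.append(sizes[-1] // pool)
    return sizes


def grid():
    """Yield one configuration per combination of the listed values."""
    for units, n_rnn, pools in product(UNITS, RECURRENT_LAYERS, POOLING):
        yield {
            "units": units,
            "recurrent_layers": n_rnn,
            "conv_layers": len(pools),
            "freq_pooling": pools,
        }


configs = list(grid())
stages = bands_after_pooling(N_BANDS, (5, 4, 2))
```

Under these assumptions the grid has 2 × 3 × 8 = 48 configurations, and `stages` reproduces the 40 → 8 → 2 → 1 sequence given in the quote.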

“…In this regard, the Bird Audio Detection challenge [5] is organized with an objective to stimulate the research on BAD systems which can work on real life bioacoustics monitoring projects. The challenge provides three bird audio datasets recorded in different acoustic environments.…”

Section: Introduction

mentioning

confidence: 99%

“…The final dataset consists of recordings from a different physical environment and it is employed as the evaluation data. An extensive review on the recent work on BAD can also be found in [5].…”

Section: Introduction

mentioning

confidence: 99%


Convolutional recurrent neural networks for bird audio detection

Çakır

Adavanne

Parascandolo

et al. 2017

2017 25th European Signal Processing Conference (EUSIPCO)


Bird sounds possess distinctive spectral structure which may exhibit small shifts in spectrum depending on the bird species and environmental conditions. In this paper, we propose using convolutional recurrent neural networks on the task of automated bird audio detection in real-life environments. In the proposed method, convolutional layers extract high dimensional, local frequency shift invariant features, while recurrent layers capture longer term dependencies between the features extracted from short time frames. This method achieves 88.5% Area Under ROC Curve (AUC) score on the unseen evaluation data and obtains the second place in the Bird Audio Detection challenge.
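For readers unfamiliar with the AUC metric quoted in this abstract, the following is a minimal sketch of how an Area Under ROC Curve score can be computed from per-recording labels and detector scores. It uses the pairwise-ranking definition (the probability that a randomly chosen positive recording is scored above a randomly chosen negative one). The function name `roc_auc` and the example data are illustrative assumptions, not the challenge's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for recordings containing a bird, 0 otherwise.
    scores: detector outputs, higher meaning "more likely bird".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only: four recordings, two with birds.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

The pairwise form is quadratic in the number of recordings, so it is suited to explanation and small checks; a sort-based computation would be preferable for a full evaluation set.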

“…Given that it is relatively easy to collect audio recordings from the field, one must first determine which of these recordings contain a bird sound. This was the task addressed in the recently concluded bird activity detection (BAD) challenge [4], [5]. The challenge provided two datasets with audio recordings labeled as either bird (having a bird sound) and non-bird (having no bird sound.)…”

mentioning

confidence: 99%

Rapid bird activity detection using probabilistic sequence kernels

Thakur

Jyothi

Rajan

et al. 2017

2017 25th European Signal Processing Conference (EUSIPCO)


Bird activity detection is the task of determining if a bird sound is present in a given audio recording. This paper describes a bird activity detector which utilises a support vector machine (SVM) with a dynamic kernel. Dynamic kernels are used to process sets of feature vectors having different cardinalities. The probabilistic sequence kernel (PSK) is one such dynamic kernel. The PSK converts a set of feature vectors from a recording into a fixed-length vector. We propose to use a variant of PSK in this work. Before computing the fixed-length vector, cepstral mean and variance normalisation and short-time Gaussianization are performed on the feature vectors. This reduces environment mismatch between different recordings. We also demonstrate a simple procedure to speed up the proposed method by reducing the size of the fixed-length vector. A speedup of almost 70% is observed, with a very small drop in accuracy. The proposed method is also compared with a random forest classifier and is shown to outperform it.
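The cepstral mean and variance normalisation step mentioned in this abstract can be illustrated as follows. This is a minimal sketch of the general technique, assuming per-recording statistics and a small `eps` guard against zero variance. It is not the authors' implementation, and it does not cover short-time Gaussianization or the PSK itself. The function name `cmvn` is hypothetical.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Normalise each feature dimension to zero mean and unit variance.

    frames: list of equal-length feature vectors (one per time frame)
            taken from a single recording.
    eps:    assumed guard value to avoid division by zero.
    """
    columns = list(zip(*frames))              # one tuple per feature dimension
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]   # population standard deviation
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


# Illustrative data only: two frames with two coefficients each.
normalised = cmvn([[1.0, 10.0], [3.0, 30.0]])
```

Because the statistics are computed separately for each recording, offsets and scale differences that are constant within a recording are removed, which is consistent with the abstract's stated aim of reducing environment mismatch between recordings.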


First assessment of passive acoustics as a tool to monitor the endangered Mediterranean monk seal in the Madeira Archipelago (Portugal)

Muñoz‐Duque

Vieira

Fonseca

et al. 2024

Aquatic Conservation


The rarest seal and the world's most endangered pinniped species, the Mediterranean monk seal (Monachus monachus), has a small and isolated population in the Madeira Archipelago (Portugal). This species tends to be extremely wary of humans and, therefore, very difficult to approach and study. Passive acoustic monitoring (PAM) is a non‐invasive, cost‐effective tool that can be a valuable complement for the traditional monitoring methods, providing insight for effective conservation of the seal in the Madeira Archipelago. In this pilot study, custom‐designed autonomous underwater recorders were deployed in two marine protected areas (Garajau Partial Nature Reserve and the Desertas Islands Nature Reserve) to assess the potential of PAM to detect and monitor this elusive and endangered species in the Madeira Archipelago. Two call types putatively produced by M. monachus were detected in a subsample of audio files recorded over a 3‐month acoustic deployment; these call types share similarities with the /growl/ and /hiccup/ recently described for M. monachus in a Mediterranean population. The most common sound type detected was the low‐frequency growl. No obvious pattern was found in the abundance of sounds according to sampling date, and no significant difference was found in the abundance of sounds in different periods of the day. The ability to detect the species' underwater vocalizations with PAM opens the possibility of future monitoring plans based on data obtained from audio recordings. These data can provide relevant information for conservation, namely, on the presence and abundance of the seals.
