2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP) 2016
DOI: 10.1109/mlsp.2016.7738875

Bird detection in audio: A survey and a challenge

Abstract: Many biological monitoring projects rely on acoustic detection of birds. Despite increasingly large datasets, this detection is often manual or semi-automatic, requiring manual tuning/postprocessing. We review the state of the art in automatic bird sound detection, and identify a widespread need for tuning-free and species-agnostic approaches. We introduce new datasets and an IEEE research challenge to address this need, to make possible the development of fully automatic algorithms for bird sound detection.

Citations: cited by 111 publications (114 citation statements)
References: 31 publications
“…The Bird Audio Detection challenge [5] consists of a development and an evaluation set. The development set consists of freefield1010 (field recordings gathered by the FreeSound project) and warblr (crowd-sourced recordings collected through smartphone app) datasets, and the evaluation set consists of chernobyl (collected by unattended recorders in Chernobyl exclusion zone) dataset.…”
Section: Datasets
mentioning
confidence: 99%
“…The grid search covers each of the combinations of the following hyperparameter values: the number of CNN feature maps/RNN hidden units (the same amount for both) {96, 256}; the number of recurrent layers {1, 2, 3}; and the number of convolutional layers {1, 2, 3, 4} with the following frequency max pooling arrangements after each convolutional layer {(4), (2, 2), (4, 2), (8, 5), (2, 2, 2), (5, 4, 2), (2, 2, 2, 1), (5, 2, 2, 2)}. Here, the numbers denote the number of frequency bands at each max pooling step; e.g., the configuration (5, 4, 2) pools the original 40 bands to one band in three stages: 40 bands → 8 bands → 2 bands → 1 band.…”
Section: Evaluation Metric and Configuration
mentioning
confidence: 99%
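
The pooling arithmetic in the statement above can be checked with a short sketch. This is not the citing authors' code: the names `band_schedule` and `grid` are hypothetical, and the sketch assumes that the length of each pooling arrangement fixes the number of convolutional layers, so that the grid is the product of feature-map counts, recurrent-layer counts, and pooling arrangements. Only the hyperparameter values and the 40 input bands come from the quoted text.

```python
from itertools import product

N_BANDS = 40  # input frequency bands, as stated in the quoted passage

# Hyperparameter values copied from the quoted passage.
FEATURE_MAPS = [96, 256]      # CNN feature maps = RNN hidden units
RECURRENT_LAYERS = [1, 2, 3]
POOLING = [
    (4,), (2, 2), (4, 2), (8, 5),
    (2, 2, 2), (5, 4, 2), (2, 2, 2, 1), (5, 2, 2, 2),
]


def band_schedule(pooling, n_bands=N_BANDS):
    """Return the band count before pooling and after each pooling step."""
    bands = [n_bands]
    for size in pooling:
        if bands[-1] % size != 0:
            raise ValueError(f"{bands[-1]} bands not divisible by pool size {size}")
        bands.append(bands[-1] // size)
    return bands


def grid():
    """Yield one configuration per grid point.

    Assumption (not stated in the source): the number of convolutional
    layers equals the length of the pooling arrangement.
    """
    for units, n_rnn, pooling in product(FEATURE_MAPS, RECURRENT_LAYERS, POOLING):
        yield {
            "units": units,
            "recurrent_layers": n_rnn,
            "conv_layers": len(pooling),
            "pooling": pooling,
        }


if __name__ == "__main__":
    # Reproduces the worked example from the quote: 40 -> 8 -> 2 -> 1.
    print(band_schedule((5, 4, 2)))
    for config in grid():
        print(config, band_schedule(config["pooling"]))
```

Under this reading, every listed arrangement divides the 40 bands evenly at each step, but not every arrangement pools down to a single band; for example, (4) leaves 10 bands.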
“…Given that it is relatively easy to collect audio recordings from the field, one must first determine which of these recordings contain a bird sound. This was the task addressed in the recently concluded bird activity detection (BAD) challenge [4], [5]. The challenge provided two datasets with audio recordings labeled as either bird (having a bird sound) and non-bird (having no bird sound.)…”
mentioning
confidence: 99%