2021
DOI: 10.1016/j.ecoinf.2021.101237
Acoustic auto-encoders for biodiversity assessment

Cited by 12 publications (14 citation statements)
References 14 publications
“…An audio segment duration of twelve seconds is unlike the standard practice in the field, which is typically less than five seconds [19]. Our choice of 12 s segments was influenced by the Short-Time Fourier Transform (STFT) parameters, which yield 515 bins in the frequency axis.…”
Section: Methods (mentioning)
confidence: 99%
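The quoted statement ties segment duration to STFT frequency resolution. As a quick check of the arithmetic, a one-sided STFT yields n_fft // 2 + 1 frequency bins, so 515 bins implies a window of 1028 samples. A minimal sketch using scipy; the sample rate and window length here are assumptions inferred from the quoted bin count, not parameters confirmed by the paper:

```python
import numpy as np
from scipy.signal import stft

# 515 one-sided frequency bins implies nperseg = (515 - 1) * 2 = 1028 samples.
fs = 22050                    # assumed sample rate
nperseg = 1028
x = np.random.randn(fs * 12)  # a 12 s segment of toy audio

f, t, Z = stft(x, fs=fs, nperseg=nperseg)
print(Z.shape[0])             # number of frequency bins: 515
```

The frequency axis runs from 0 to fs / 2 in 515 evenly spaced bins, which is the resolution the quoted passage refers to.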
“…The majority of deep learning methodologies for ecoacoustics applications reported in the literature rely on supervised approaches, whereas end-to-end unsupervised frameworks have not been sufficiently studied. However, some approximations have emerged, e.g., Rowe et al [19] used autoencoders to characterize sound types and identify groups corresponding to species vocalizations. Nevertheless, although the methodology is unsupervised, it requires prior information about the species for validation.…”
Section: Related Work (mentioning)
confidence: 99%
“…Rather than designing acoustic indices heuristically, deep learning algorithms learn a numerical soundscape description from a given audio set. These can be with pre‐trained models (Sethi et al, 2020) or use self‐supervised methods (Rowe et al, 2021). Learned representations are high dimensional, data‐driven descriptors and, unlike current acoustic indices, not based on human assumptions about links between soundscapes and ecology.…”
Section: Emerging Trends In Ecoacoustic Analyses (mentioning)
confidence: 99%
“…An opportunity to avoid having to manually choose the right set of features and tune their settings is to use features learnt using deep representation learning (as opposed to handcrafted features like PAFs). Following this approach, auto-encoder artificial neural networks have been used by Goffinet et al [28] on mice and zebra finch vocalisations, by Bergler et al [29] to cluster orca calls, by Rowe et al [30] to cluster bird vocalisations by species, and by Tolkova et al [31] to discriminate between background noise and bird vocalisations.…”
Section: Vocalisations Feature Extraction and Clustering (mentioning)
confidence: 99%
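The passage above describes the common pipeline in these citing works: learn features with an auto-encoder instead of handcrafting them, then cluster the learned codes. As an illustrative toy only (not the method of any cited paper), the sketch below trains a minimal linear auto-encoder by gradient descent on synthetic "spectrogram" vectors drawn from two invented call types, then runs k-means on the bottleneck codes; all dimensions, scales, and variable names are assumptions made for the sketch:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)

# Toy "spectrogram" vectors: two synthetic call types with distinct spectra.
n, d, k = 200, 64, 8                      # samples, input dim, bottleneck dim
centers = rng.normal(size=(2, d)) / np.sqrt(d)
labels_true = rng.integers(0, 2, size=n)
X = centers[labels_true] + 0.05 * rng.normal(size=(n, d))

# Minimal linear auto-encoder: encode z = x @ W1, decode x_hat = z @ W2,
# trained by plain gradient descent on the mean squared reconstruction error.
W1 = 0.1 * rng.normal(size=(d, k))
W2 = 0.1 * rng.normal(size=(k, d))
lr = 0.1
for _ in range(300):
    Z = X @ W1
    err = Z @ W2 - X                      # gradient of 0.5 * ||X_hat - X||^2
    gW2 = Z.T @ err / n
    gW1 = X.T @ (err @ W2.T) / n
    W1 -= lr * gW1
    W2 -= lr * gW2

loss = float(np.mean((X @ W1 @ W2 - X) ** 2))

# Cluster the learned bottleneck codes rather than the raw input vectors.
codes = X @ W1
_, assign = kmeans2(codes, 2, minit='++', seed=0)
```

The same two-stage structure (representation learning, then clustering) applies whether the encoder is this toy linear map or a deep convolutional auto-encoder; only the encoder changes.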