We study deep bioacoustic event detection through multi-head attention-based pooling, exemplified by wildlife monitoring. In the multiple instance learning framework, a core deep neural network learns a projection of the input acoustic signal into a sequence of embeddings, each representing a segment of the input. Sequence pooling is then required to aggregate the information present in the sequence into a single clip-wise representation. We propose an improvement, based on Squeeze-and-Excitation mechanisms, upon a recently proposed audio tagging ResNet, and show that it performs significantly better than the baseline, as well as a collection of other recent audio models. We then further enhance our model by performing an extensive comparative study of recent sequence pooling mechanisms, and achieve our best result using multi-head self-attention followed by concatenation of the head-specific pooled embeddings, which outperforms prediction pooling methods as well as other recent sequence pooling tricks. We perform these experiments on a novel dataset of spider monkey whinny calls that we introduce here, recorded in a rainforest on the South Pacific coast of Costa Rica, with a promising outlook for minimally invasive wildlife monitoring.
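The pooling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a simplified form of attention pooling in which each head owns one learned query vector and attends over its own slice of the embedding. The function name `multi_head_attention_pool`, the `head_queries` parameter, and the shapes used are all hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multi_head_attention_pool(embeddings, head_queries):
    """Pool a (T, D) sequence of segment embeddings into one (D,) clip vector.

    Assumption (not from the paper): each of the H heads has a single query
    vector of size D // H and attends over its own slice of every embedding.
    Returns the concatenated head-specific pooled embeddings and the
    (T, H) attention weights.
    """
    num_segments, dim = embeddings.shape
    num_heads, head_dim = head_queries.shape
    if num_heads * head_dim != dim:
        raise ValueError("num_heads * head_dim must equal the embedding size")

    # Split every segment embedding into H head-specific slices.
    heads = embeddings.reshape(num_segments, num_heads, head_dim)

    # Scaled dot-product score of each segment against each head's query.
    scores = np.einsum("thd,hd->th", heads, head_queries) / np.sqrt(head_dim)

    # Normalise over time so each head's weights sum to 1 across segments.
    weights = softmax(scores, axis=0)

    # Weighted sum over time per head, then concatenate the heads.
    pooled = np.einsum("th,thd->hd", weights, heads)
    return pooled.reshape(dim), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    segment_embeddings = rng.normal(size=(10, 8))  # 10 segments, 8-dim each
    queries = rng.normal(size=(2, 4))              # 2 heads, 4 dims per head
    clip_vector, attn = multi_head_attention_pool(segment_embeddings, queries)
    print(clip_vector.shape)
```

In this sketch, an all-zero query makes every head attend uniformly, so the result reduces to plain average pooling over the segments; non-zero queries let each head emphasise different segments of the clip.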
As more land is altered by human activity and more species become at risk of extinction, it is essential that we understand the requirements for conserving threatened species across human-modified landscapes. Owing to their rarity and often sparse distributions, threatened species can be difficult to study and efficient methods to sample them across wide temporal and spatial scales have been lacking. Passive acoustic monitoring (PAM) is increasingly recognized as an efficient method for collecting data on vocal species; however, the development of automated species detectors required to analyse large amounts of acoustic data is not keeping pace. Here, we collected 35 805 h of acoustic data across 341 sites in a region over 1000 km² to show that PAM, together with a newly developed automated detector, is able to successfully detect the endangered Geoffroy's spider monkey (Ateles geoffroyi), allowing us to show that Geoffroy's spider monkey was absent below a threshold of 80% forest cover and within 1 km of primary paved roads and occurred equally in old growth and secondary forests. We discuss how this methodology circumvents many of the existing issues in traditional sampling methods and can be highly successful in the study of vocally rare or threatened species. Our results provide tools and knowledge for setting targets and developing conservation strategies for the protection of Geoffroy's spider monkey.