Large bioacoustic archives of wild animals are an important resource for identifying recurring communication patterns, which can then be related to recurring behavioral patterns to advance the current understanding of intra-specific communication in non-human animals. A major challenge is that most large-scale bioacoustic archives contain only a small percentage of animal vocalizations and a large amount of environmental noise, which makes it extremely difficult to manually retrieve enough vocalizations for further analysis. This is particularly important for species with advanced social systems and complex vocalizations. In this study, deep neural networks were trained on 11,509 killer whale (Orcinus orca) signals and 34,848 noise segments. The resulting toolkit, ORCA-SPOT, was tested on a large-scale bioacoustic repository, the Orchive, comprising roughly 19,000 hours of killer whale underwater recordings. An automated segmentation of the entire Orchive (about 2.2 years of audio) took approximately 8 days. It achieved a time-based precision, or positive predictive value (PPV), of 93.2% and an area under the curve (AUC) of 0.9523. This approach enables automated annotation of large bioacoustic databases to extract killer whale sounds, which are essential for the subsequent identification of significant communication patterns. The code will be publicly available in October 2019 to support the application of deep learning to bioacoustic research. ORCA-SPOT can be adapted to other animal species.
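To make the reported figures concrete, the following is a minimal sketch of how a time-based precision could be computed from segment lists, together with the throughput arithmetic implied by the numbers above. The function names and the representation of segments as `(start, end)` pairs in seconds are assumptions for illustration; this is not the ORCA-SPOT evaluation code.

```python
def total_duration(segments):
    """Sum of the lengths of (start, end) segments, in the same time unit."""
    return sum(end - start for start, end in segments)


def overlap_duration(predicted, reference):
    """Total time covered by both segment lists.

    Assumes each list is sorted by start time and has no internal overlaps.
    """
    i = j = 0
    overlap = 0.0
    while i < len(predicted) and j < len(reference):
        start = max(predicted[i][0], reference[j][0])
        end = min(predicted[i][1], reference[j][1])
        if end > start:
            overlap += end - start
        # Advance whichever segment ends first.
        if predicted[i][1] < reference[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def time_based_precision(predicted, reference):
    """Fraction of predicted signal time that is annotated as signal."""
    predicted_time = total_duration(predicted)
    if predicted_time == 0:
        return 0.0
    return overlap_duration(predicted, reference) / predicted_time


# Throughput arithmetic from the abstract's figures.
archive_hours = 19_000
archive_years = archive_hours / (24 * 365)      # roughly 2.2 years of audio
speedup = archive_hours / (8 * 24)              # roughly 99x faster than real time
```

Under these assumptions, a predicted list `[(0, 2), (5, 6)]` against a reference `[(1, 3), (5, 5.5)]` overlaps for 1.5 of 3 predicted seconds, which gives a time-based precision of 0.5.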
Noise reduction is an important part of modern hearing aids and is included in most commercially available devices. Deep learning-based state-of-the-art algorithms, however, either do not consider real-time and frequency resolution constraints or result in poor quality under very noisy conditions. To improve monaural speech enhancement in noisy environments, we propose CLCNet, a framework based on complex-valued linear coding. First, we define complex linear coding (CLC), motivated by linear predictive coding (LPC), which is applied in the complex frequency domain. Second, we propose a framework that incorporates complex spectrogram input and coefficient output. Third, we define a parametric normalization for complex-valued spectrograms that complies with low-latency and online processing. Our CLCNet was evaluated on a mixture of the EUROM database and a real-world noise dataset recorded with hearing aids, and compared to traditional real-valued Wiener filter gains.
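The abstract does not state the CLC formula, so the following is only a sketch under an assumption drawn from the LPC analogy: each enhanced time-frequency bin is a linear combination of the current and previous noisy frames in the same frequency band, weighted by complex coefficients. The function name, the array layout, and the zero-padding of the first frames are hypothetical choices for illustration.

```python
import numpy as np


def apply_complex_linear_coding(noisy_spec, coefs):
    """Apply per-bin complex coefficients over a short history of frames.

    noisy_spec: complex array of shape (n_freq, n_time)
    coefs:      complex array of shape (n_freq, n_time, order)
    Returns a complex array of shape (n_freq, n_time).

    Assumed form: enhanced[k, t] = sum_i coefs[k, t, i] * noisy_spec[k, t - i],
    with frames before the start of the signal treated as zero.
    """
    noisy_spec = np.asarray(noisy_spec, dtype=complex)
    coefs = np.asarray(coefs, dtype=complex)
    n_freq, n_time = noisy_spec.shape
    order = coefs.shape[2]

    enhanced = np.zeros_like(noisy_spec)
    for i in range(min(order, n_time)):
        # Shift the spectrogram by i frames so column t holds frame t - i.
        delayed = np.zeros_like(noisy_spec)
        delayed[:, i:] = noisy_spec[:, : n_time - i]
        enhanced += coefs[:, :, i] * delayed
    return enhanced
```

Because this form uses only the current and past frames, it is causal, which is consistent with the low-latency requirement mentioned above. With `order = 1` and real-valued coefficients it reduces to a per-bin gain, the kind of real-valued mask that the Wiener filter baseline represents.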