2021
DOI: 10.1121/10.0005047

Automatic detection and classification of baleen whale social calls using convolutional neural networks

Abstract: Passive acoustic monitoring has proven to be an indispensable tool for many aspects of baleen whale research. Manual detection of whale calls on these large data sets demands extensive manual labor. Automated whale call detectors offer a more efficient approach and have been developed for many species and call types. However, calls with a large level of variability such as fin whale (Balaenoptera physalus) 40 Hz call and blue whale (B. musculus) D call have been challenging to detect automatically and hence no…

Cited by 30 publications (19 citation statements)
References 37 publications
“…Our results suggest that machine learning algorithms for detection of blue whale D-calls appear now to have advanced to the point where they can perform better, cheaper and more reliably than a human analyst, at least under the circumstances we tested. While other studies, such as (Rasmussen & Širović, 2021) have hinted at this, our double-observer approach not only allowed us to unambiguously determine that the automated detector was superior, but also to better understand and quantify the reasons behind its advantage: superior performance at low and moderate SNR. The high performance and expedient operation will provide tangible improvements to the analysis of large Antarctic datasets such as those from the SOHN (Miller, Milnes, et al, 2021; Van Opzeeland et al, 2013).…”
Section: Discussion
confidence: 90%
“…Initially, it appeared that our DenseNet-automated detector, tested on an independent test dataset, may have had worse performance than other recent D-call detectors (e.g. Rasmussen & Širović, 2021; Socheleau & Samaran, 2017). However, after adjudication and application of mark-recapture methods, it appeared that the performance of our detector was similar to, if not slightly better than, these previous methods.…”
Section: Discussion
confidence: 97%
“…We use EfficientNet B0, currently the smallest architecture available for off-the-shelf transfer learning (Tan and Le, 2019), to develop a computationally low-cost CNN model for multi-sound source detection. Other architectures such as ResNet (He et al, 2016), VGG (Simonyan and Zisserman, 2014) and AlexNet (Krizhevsky et al, 2012) are common choices for transfer learning within marine mammal species detection and classification studies (Bergler et al, 2019; Rasmussen & Širović, 2021; Allen et al, 2021; Lu et al, 2021). These architectures possess more trainable parameters making them computationally more expensive, and studies have shown that larger networks do not always obtain higher accuracies (Bergler et al, 2019).…”
Section: Discussion
confidence: 99%
“…For example, Padovese et al [79] used image augmentation to generate synthetic calls to increase training data size resulting in increased classifier recall and precision for labeling North Atlantic right whale (Eubalaena glacialis) upcalls. Rasmussen and Širović [80] used scaling and translation augmentation to prevent their classifier from overfitting during the training process. Image augmentation was beneficial in this study because the number of images for training and testing was relatively small (<400 images) for each sound type; in fact, two of the six fish calls had <100 images each for training and testing (Figure 3).…”
Section: Resnet-50 Classifier
confidence: 99%
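
The scaling and translation augmentation mentioned in the last statement can be illustrated with a minimal sketch. It is not taken from Rasmussen and Širović or from the citing paper: the function names (`translate`, `rescale`, `random_augment`), the nearest-neighbour resampling, the zero fill value, and the shift and scale ranges are all assumptions chosen for illustration, and "scaling" is read here as a geometric zoom of the spectrogram image rather than an amplitude change.

```python
import numpy as np


def translate(spec, shift_freq=0, shift_time=0, fill=0.0):
    """Shift a 2-D spectrogram by whole bins; exposed edges are set to `fill`."""
    out = np.roll(spec.astype(float), (shift_freq, shift_time), axis=(0, 1))
    # np.roll wraps around, so overwrite the rows/columns that wrapped in.
    if shift_freq > 0:
        out[:shift_freq, :] = fill
    elif shift_freq < 0:
        out[shift_freq:, :] = fill
    if shift_time > 0:
        out[:, :shift_time] = fill
    elif shift_time < 0:
        out[:, shift_time:] = fill
    return out


def rescale(spec, factor, fill=0.0):
    """Zoom about the image centre (nearest neighbour), keeping the array shape."""
    n_f, n_t = spec.shape
    c_f, c_t = (n_f - 1) / 2, (n_t - 1) / 2
    # For every output bin, find the source bin it maps back to.
    f_idx = np.round((np.arange(n_f) - c_f) / factor + c_f).astype(int)
    t_idx = np.round((np.arange(n_t) - c_t) / factor + c_t).astype(int)
    valid_f = (f_idx >= 0) & (f_idx < n_f)
    valid_t = (t_idx >= 0) & (t_idx < n_t)
    out = np.full(spec.shape, fill, dtype=float)
    out[np.ix_(valid_f, valid_t)] = spec[np.ix_(f_idx[valid_f], t_idx[valid_t])]
    return out


def random_augment(spec, rng, max_shift=2, scale_range=(0.9, 1.1)):
    """Apply one random zoom followed by one random shift (illustrative ranges)."""
    factor = rng.uniform(*scale_range)
    d_f, d_t = rng.integers(-max_shift, max_shift + 1, size=2)
    return translate(rescale(spec, factor), int(d_f), int(d_t))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spectrogram = rng.random((64, 128))  # stand-in for a real call image
    augmented = random_augment(spectrogram, rng)
    print(augmented.shape)
```

Drawing a fresh random shift and zoom each time an image is presented means the classifier rarely sees exactly the same pixels twice, which is the usual rationale for using such augmentation against overfitting on small training sets.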