Cetaceans have attracted the attention of researchers in recent decades because of their importance to the ecosystem and their economic value. They use sound for communication, echolocation and other social activities. Their sounds are highly non-stationary and transitory, and range from short to long in duration. Passive acoustic monitoring (PAM) is a popular method for monitoring cetaceans in their ecosystems. The volumes of data accumulated using PAM are usually large, so they are difficult to analyze by manual inspection. Therefore, different techniques, with mixed outcomes, have been developed for the automatic detection and classification of the signals of different cetacean species. So far, no single technique can perfectly detect and classify the vocalizations of over 82 known species, owing to variability in time-frequency characteristics, differences in amplitude among species and within a species' vocal repertoire, and the physical environment, among other factors. The accuracy of any detector or classifier depends on the technique adopted as well as the nature of the signal to be analyzed. In this article, we review the existing techniques for the automatic detection and classification of cetacean vocalizations. We categorize the surveyed techniques and emphasize their advantages and disadvantages. The article suggests possible research directions that can improve existing detection and classification techniques. In addition, the article recommends other suitable techniques for analyzing non-linear and non-stationary signals such as cetacean signals. Several studies have been dedicated to this topic; however, there is no review of these past results that gives a quick overview of the area of cetacean detection and classification. This review will help researchers and practitioners in the field to make informed decisions based on their requirements.
This letter proposes an empirical mode decomposition (EMD) based hidden Markov model (HMM) approach for detecting the pulse calls of mysticetes such as Bryde's whales. The detection capability of the HMM depends on the feature extraction (FE) technique deployed. The EMD is proposed as a performance-efficient alternative to the popular Mel-scale frequency cepstral coefficient (MFCC) and linear predictive coefficient (LPC) FE techniques. The amplitude modulation–frequency modulation components derived from the EMD process are modified to form feature vectors for the HMM. The ensemble EMD (EEMD) is adapted in the same way as the EMD. The proposed EMD-HMM and EEMD-HMM approaches achieved better performance than the MFCC-HMM and LPC-HMM approaches.
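As background for the EMD-based features described above, the decomposition is conventionally written as a sum of intrinsic mode functions plus a residue, with each mode viewed as an amplitude- and frequency-modulated component. The symbols below ($x$, $c_k$, $r_K$, $a_k$, $\phi_k$, $K$) are notation assumed here for illustration and are not taken from the letter; the abstract does not say how the components are modified into feature vectors, so that step is not shown.

```latex
% Standard EMD representation (notation assumed for illustration)
\begin{align}
  x(t)   &= \sum_{k=1}^{K} c_k(t) + r_K(t), \\
  c_k(t) &= a_k(t)\,\cos\bigl(\phi_k(t)\bigr),
\end{align}
% x(t): recorded signal; c_k(t): k-th intrinsic mode function;
% r_K(t): residue after K modes; a_k(t): instantaneous amplitude (AM part);
% \phi_k(t): instantaneous phase, whose derivative gives the FM part.
```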
Passive acoustic monitoring (PAM) is generally used to extract acoustic signals produced by cetaceans. However, the large data volume from the PAM process is better analyzed using an automated technique such as the hidden Markov model (HMM). In this paper, the HMM is used as a detection and classification technique because of its robustness and low time complexity. Nonetheless, certain parameters, such as the choice of features to be extracted from the signal, the frame duration, and the number of states, affect the performance of the model. The results show that the HMM performs best as the number of states increases with a short frame duration. However, increasing the number of states increases the computational complexity of the model. Inshore Bryde's whales produce short pulse calls with distinct signal features that are observable in the time domain. Hence, a time-domain feature vector is used to reduce the complexity of the HMM. Simulation results also show that, among the feature vectors considered, average power as a time-domain feature provides the best performance for detecting the short pulse calls of inshore Bryde's whales with the HMM technique. Moreover, the extracted features, namely the average power, mean, and zero-crossing rate, are combined to form a single 3-dimensional vector (PaMZ). The PaMZ-HMM shows improved performance and reduced complexity compared with existing feature extraction techniques such as Mel-scale frequency cepstral coefficients (MFCC) and linear predictive coding (LPC), making the PaMZ-HMM suitable for real-time detection.
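The three time-domain features named above can be sketched per frame as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`frame_signal`, `pamz_features`, `extract_pamz`), the framing scheme, and the exact feature definitions (mean of squared samples, arithmetic mean, and sign changes divided by the number of adjacent sample pairs) are all hypothetical choices, since the abstract does not give them.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sequence into frames of frame_len samples, advancing by hop.

    Trailing samples that do not fill a whole frame are dropped (an assumption).
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def pamz_features(frame):
    """Return (average power, mean, zero-crossing rate) for one non-empty frame."""
    n = len(frame)
    # Average power: mean of the squared sample values.
    avg_power = sum(x * x for x in frame) / n
    # Mean: arithmetic mean of the sample values.
    mean = sum(frame) / n
    # Zero-crossing rate: fraction of adjacent pairs whose signs differ
    # (zero is treated as non-negative here, which is an assumption).
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1) if n > 1 else 0.0
    return (avg_power, mean, zcr)


def extract_pamz(samples, frame_len, hop):
    """Return one 3-dimensional feature tuple per frame of the signal."""
    return [pamz_features(f) for f in frame_signal(samples, frame_len, hop)]
```

Calling `extract_pamz(samples, frame_len, hop)` on a recording yields a sequence of 3-dimensional vectors, one per frame, which is the kind of observation sequence an HMM would then be trained on. The low dimensionality, compared with a typical MFCC vector, is consistent with the reduced complexity the abstract reports.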