Although field-collected recordings typically contain multiple simultaneously vocalizing birds of different species, acoustic species classification in this setting has received little study so far. This work formulates the problem of classifying the set of species present in an audio recording using the multi-instance multi-label (MIML) framework for machine learning, and proposes a MIML bag generator for audio, i.e., an algorithm which transforms an input audio signal into a bag-of-instances representation suitable for use with MIML classifiers. The proposed representation uses a 2D time-frequency segmentation of the audio signal, which can separate bird sounds that overlap in time. Experiments using audio data containing 13 species collected with unattended omnidirectional microphones in the H. J. Andrews Experimental Forest demonstrate that the proposed methods achieve high accuracy (96.1% true positives/negatives). Automated detection of bird species occurrence using MIML has many potential applications, particularly in long-term monitoring of remote sites, species distribution modeling, and conservation planning.
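To make the bag-of-instances idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a segmenter has already produced a binary time-frequency mask, and it treats each 4-connected region of that mask as one instance. The function name `bag_from_mask`, the toy feature vector (time extent, frequency extent, area), and the species labels are all hypothetical choices made for illustration.

```python
from collections import deque


def bag_from_mask(mask):
    """Turn a binary time-frequency mask into a bag of instances.

    mask[f][t] is truthy where an (assumed) segmenter marked bird sound.
    Each 4-connected region becomes one instance, described by a toy
    feature vector: (t_min, t_max, f_min, f_max, area).
    """
    n_freq = len(mask)
    n_time = len(mask[0]) if n_freq else 0
    seen = [[False] * n_time for _ in range(n_freq)]
    bag = []
    for f0 in range(n_freq):
        for t0 in range(n_time):
            if not mask[f0][t0] or seen[f0][t0]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(f0, t0)])
            seen[f0][t0] = True
            cells = []
            while queue:
                f, t = queue.popleft()
                cells.append((f, t))
                for df, dt in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nf, nt = f + df, t + dt
                    inside = 0 <= nf < n_freq and 0 <= nt < n_time
                    if inside and mask[nf][nt] and not seen[nf][nt]:
                        seen[nf][nt] = True
                        queue.append((nf, nt))
            ts = [t for _, t in cells]
            fs = [f for f, _ in cells]
            bag.append((min(ts), max(ts), min(fs), max(fs), len(cells)))
    return bag


# Two sounds that overlap in time (columns 1-2) but sit in different
# frequency rows, so they end up as two separate instances.
mask = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
]
bag = bag_from_mask(mask)
labels = {"species_a", "species_b"}  # hypothetical bag-level label set
```

In the MIML setting the label set is attached to the whole recording (the bag), not to individual segments, so a classifier must learn which instances account for which labels.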
Recent work in machine learning considers the problem of identifying bird species from an audio recording. Most methods require segmentation to isolate each syllable of bird call in the input audio. Energy-based time-domain segmentation has been successfully applied to low-noise, single-bird recordings. However, audio from automated field recorders contains too much noise for such methods, so a more robust segmentation method is required. We propose a supervised time-frequency audio segmentation method that uses a Random Forest classifier to extract syllables of bird call from a noisy signal. When applied to a test data set of 625 field-collected audio segments, our method isolates 93.6% of the acoustic energy of bird song with a false positive rate of 8.6%, outperforming energy thresholding.
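For reference, the energy-thresholding baseline mentioned above can be sketched roughly as follows. This is an illustrative toy version under assumed parameters (`frame_len`, `threshold`, and the function name `energy_segments` are hypothetical), not the baseline used in the paper's evaluation, and it is not the proposed Random Forest method.

```python
def energy_segments(samples, frame_len, threshold):
    """Return (start, end) sample ranges whose mean frame energy exceeds threshold.

    The signal is cut into non-overlapping frames of frame_len samples;
    consecutive above-threshold frames are merged into one segment.
    Trailing samples that do not fill a whole frame are ignored.
    """
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = i * frame_len  # a new segment begins here
        elif start is not None:
            segments.append((start, i * frame_len))
            start = None
    if start is not None:
        # The signal ended while still above the threshold.
        segments.append((start, n_frames * frame_len))
    return segments


# Silence, a short burst, then silence again.
signal = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0] * 4
segments = energy_segments(signal, frame_len=2, threshold=0.5)
```

Because this kind of scheme only marks time intervals, broadband background noise raises the energy of every frame, and two birds calling at once fall into the same interval; a segmentation defined over both time and frequency avoids the second limitation.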