High rates of biodiversity loss caused by human-induced changes in the environment require new methods for large scale fauna monitoring and data analysis. While ecoacoustic monitoring is increasingly being used and shows promise, analysis and interpretation of the big data produced remains a challenge. Computer-generated acoustic indices potentially provide a biologically meaningful summary of sound, however, temporal autocorrelation, difficulties in statistical analysis of multi-index data and lack of consistency or transferability in different terrestrial environments have hindered the application of those indices in different contexts. To address these issues we investigate the use of time-series motif discovery and random forest classification of multi-indices through two case studies. We use a semi-automated workflow combining time-series motif discovery and random forest classification of multi-index (acoustic complexity, temporal entropy, and events per second) data to categorize sounds in unfiltered recordings according to the main source of sound present (birds, insects, geophony). Our approach showed more than 70% accuracy in label assignment in both datasets. The categories assigned were broad, but we believe this is a great improvement on traditional single index analysis of environmental recordings as we can now give ecological meaning to recordings in a semi-automated way that does not require expert knowledge and manual validation is only necessary for a small subset of the data. Furthermore, temporal autocorrelation, which is largely ignored by researchers, has been effectively eliminated through the time-series motif discovery technique applied here for the first time to ecoacoustic data. We expect that our approach will greatly assist researchers in the future as it will allow large datasets to be rapidly processed and labeled, enabling the screening of recordings for undesired sounds, such as wind, or target biophony (insects and birds) for biodiversity monitoring or bioacoustics research.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.