Automatic Detection and Recognition of Tonal Bird Sounds in Noisy Environments

Jančovič, Peter; Köküer, M.

doi:10.1155/2011/982936

Cited by 50 publications

(35 citation statements)

References 16 publications


“…Typically, the first stage of an automatic system is to parse the acoustic signal into isolated spectro-temporal segments. This is often performed using an energy-based thresholding that requires an estimate of noise level, e.g., [1], or by decomposition into sinusoidal components [1], [2], [3], [4]. A variety of approaches to feature representation of the spectro-temporal segments and their modelling were explored.…”

Section: Introduction (mentioning)

confidence: 99%

“…In a case of tonal bird vocalisations, the use of a sinusoidal detection for segmentation also offers a natural way of representing the segment as a temporal sequence of the frequencies of the detected sinusoid, which we refer to as frequency track. This representation was employed in a few earlier studies [1], [6] and also in our recent works [3], [4], [7], [8], [9], [10]. Among the acoustic modelling approaches, the most commonly used are Gaussian mixture models (GMM) [1], [3], hidden Markov models (HMMs) [1], [4], [6], [11], and decision trees [12].…”

Section: Introduction (mentioning)

confidence: 99%

“…This representation was employed in a few earlier studies [1], [6] and also in our recent works [3], [4], [7], [8], [9], [10]. Among the acoustic modelling approaches, the most commonly used are Gaussian mixture models (GMM) [1], [3], hidden Markov models (HMMs) [1], [4], [6], [11], and decision trees [12]. Several studies focused on detection of specific bird species [13], [14], [15].…”

Section: Introduction (mentioning)

confidence: 99%


Automatic detection of bird species from audio field recordings using HMM-based modelling of frequency tracks

Jančovič, Köküer (2017)

2017 25th European Signal Processing Conference (EUSIPCO)


Abstract: This paper presents an automatic system for detection of bird species in field recordings. A sinusoidal detection algorithm is employed to segment the acoustic scene into isolated spectro-temporal segments. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid, referred to as a frequency track. Each bird species is represented by a set of hidden Markov models (HMMs), each HMM modelling an individual type of bird vocalisation element. These HMMs are obtained in an unsupervised manner. The detection is based on a likelihood ratio of the test utterance against the target bird species model and a non-target background model. We explore the selection of a cohort for the background model, z-norm and t-norm score normalisation techniques, and score compensation to deal with outlier data. Experiments are performed using over 40 hours of audio field recordings from 48 bird species plus an additional 16 hours of field recordings as impostor trials. Evaluations are performed using detection error trade-off plots. An equal error rate of 5% is achieved when impostor trials are non-target bird species vocalisations, and 1.2% when using field recordings which do not contain bird vocalisations.
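The z-norm normalisation and equal error rate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `z_norm` and `equal_error_rate`, the use of population standard deviation, and the simple threshold sweep are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_norm(score, impostor_scores):
    """Normalise a raw detection score with impostor-score statistics (z-norm).

    `impostor_scores` are scores of the same target model against
    non-target data (hypothetical input, not taken from the paper).
    """
    mu = mean(impostor_scores)
    sigma = pstdev(impostor_scores)
    if sigma == 0:
        raise ValueError("impostor scores have zero variance")
    return (score - mu) / sigma


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The function
    returns the average of the miss and false-alarm rates at the threshold
    where the two rates are closest.
    """
    best_gap, best_eer = None, None
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        miss = sum(s < thr for s in target_scores) / len(target_scores)
        fa = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + fa) / 2
    return best_eer
```

With well-separated target and impostor scores the sweep returns 0.0; overlapping score distributions give a value between 0 and 1, which is the kind of operating point a detection error trade-off plot summarises.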


“…In exception of most of the published work, in [17] waterfall noise was added to bird recordings and it was shown that the recognition of bird sounds in noisy conditions reduces significantly the recognition performance. In this article, we evaluate several different machine learning algorithms on the task of bird species classification in real-field conditions, under the concept of AMIBIO project (LIFE08-NAT-GR-000539: Automatic Acoustic Monitoring and Inventorying of Biodiversity, Project web-site: http://www.amibio-project.eu/).…”

Section: Introduction (mentioning)

confidence: 99%

“…Different parametric representations for the bird vocalizations audio signals have been used, among which Mel frequency cepstral coefficients [5,6,16,17] are the most widely used. Other audio features which have been proposed in the literature are the linear predictive coding [16], linear predictive cepstral coefficients [16], spectral and temporal audio descriptors [12], and tonal-based features [17].…”

Section: Introduction (mentioning)

confidence: 99%

Integration of Temporal Contextual Information for Robust Acoustic Recognition of Bird Species from Real-Field Data

Mporas, Ganchev, Kocsis et al. (2013)

IJISA


Abstract: We report on the development of an automated acoustic bird recognizer with improved noise robustness, which is part of a long-term project, aiming at the establishment of an automated biodiversity monitoring system at the Hymettus Mountain near Athens, Greece. In particular, a typical audio processing strategy, which has been proved quite successful in various audio recognition applications, was amended with a simple and effective mechanism for integration of temporal contextual information in the decision-making process. In the present implementation, we consider integration of temporal contextual information by joint post-processing of the recognition results for a number of preceding and subsequent audio frames. In order to evaluate the usefulness of the proposed scheme on the task of acoustic bird recognition, we experimented with six widely used classifiers and a set of real-field audio recordings for two bird species which are present at the Hymettus Mountain. The highest achieved recognition accuracy obtained on the real-field data was approximately 93%, while experiments with additive noise showed significant robustness in low signal-to-noise ratio setups. In all cases, the integration of temporal contextual information was found to improve the overall accuracy of the recognizer.
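One simple way to post-process frame-level recognition results jointly with their preceding and subsequent frames is a sliding majority vote. The sketch below is an assumption made for illustration only; the abstract does not say which post-processing rule the authors used, and the function name `smooth_decisions` and the `context` parameter are hypothetical.

```python
from collections import Counter


def smooth_decisions(frame_labels, context=2):
    """Replace each frame label by the majority label within +/- `context` frames.

    Windows are truncated at the start and end of the sequence.
    """
    smoothed = []
    n = len(frame_labels)
    for i in range(n):
        lo = max(0, i - context)
        hi = min(n, i + context + 1)
        counts = Counter(frame_labels[lo:hi])
        top = max(counts.values())
        # On ties, keep the frame's own label if it is among the most frequent.
        if counts[frame_labels[i]] == top:
            smoothed.append(frame_labels[i])
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed
```

For example, calling `smooth_decisions(["a", "a", "b", "a", "a"], context=1)` relabels the isolated middle frame to match its neighbours. A larger `context` suppresses more isolated errors at the cost of blurring short genuine events.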
