Lee Ngee Tan scite author profile

Annotation of phrases in birdsongs can be helpful to behavioral and population studies. To reduce the need for manual annotation, an automated birdsong phrase classification algorithm for limited data is developed. Limited data occur because of limited recordings or the existence of rare phrases. In this paper, classification of up to 81 phrase classes of Cassin's Vireo is performed using one to five training samples per class. The algorithm involves dynamic time warping (DTW) and two passes of sparse representation (SR) classification. DTW improves the similarity between training and test phrases from the same class in the presence of individual bird differences and phrase segmentation inconsistencies. The SR classifier works by finding a sparse linear combination of training feature vectors from all classes that best approximates the test feature vector. When the class decisions from DTW and the first pass SR classification are different, SR classification is repeated using training samples from these two conflicting classes. Compared to DTW, support vector machines, and an SR classifier without DTW, the proposed classifier achieves the highest classification accuracies of 94% and 89% on manually segmented and automatically segmented phrases, respectively, from unseen Cassin's Vireo individuals, using five training samples per class.
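The DTW step described in the abstract can be illustrated with a minimal sketch. It assumes each phrase is a sequence of equal-length feature frames and uses the textbook DTW recursion with a Euclidean local cost. The function names `dtw_distance` and `classify_by_dtw` are hypothetical. The paper's actual features, warping constraints, and the two sparse representation passes are not reproduced here.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Textbook DTW cost between two sequences of feature frames.

    Each frame is a tuple of floats; all frames must share one dimension.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b
                cost[i][j - 1],      # frame of seq_b repeated against seq_a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

def classify_by_dtw(test_seq, training):
    """Return the label of the training example with the lowest DTW cost.

    `training` maps a class label to a list of example sequences
    (an assumed layout, chosen for illustration only).
    """
    best_label, best_cost = None, float("inf")
    for label, examples in training.items():
        for example in examples:
            c = dtw_distance(test_seq, example)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label
```

With only one to five examples per class, a nearest-template rule like this needs no model fitting, which is one reason template methods suit limited data. In the abstract, a disagreement between this kind of DTW decision and the first-pass SR decision is what triggers the second SR pass.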


Voice activity detection using harmonic frequency components in likelihood ratio test

Tan

Borgström

Alwan

2010


This paper proposes a new statistical model-based likelihood ratio test (LRT) voice activity detector (VAD) to obtain reliable speech/non-speech decisions. In the proposed method, the likelihood ratio (LR) is calculated differently for voiced frames than for unvoiced frames: in voiced frames, only DFT bins containing harmonic spectral peaks are selected for LR computation. To evaluate the new VAD's effectiveness in improving the noise robustness of automatic speech recognition (ASR), its decisions are applied to preprocessing techniques such as non-linear spectral subtraction, the minimum mean square error short-time spectral amplitude estimator, and frame dropping. In ASR experiments conducted on the Aurora2 database, the proposed harmonic frequency-based LRTs give better results than conventional LRT-based VADs and the standard G.729B and ETSI AMR VADs.
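The idea of restricting the LR to harmonic bins can be sketched as follows. The per-bin log likelihood ratio is the standard Gaussian-model form, log Λ_k = γ_k ξ_k / (1 + ξ_k) − log(1 + ξ_k), where γ_k is the a posteriori SNR and ξ_k the a priori SNR. Everything else is an assumption made for illustration: the a priori SNR is estimated as max(γ_k − 1, 0), the harmonic bins are taken as those nearest to multiples of a known pitch rather than detected spectral peaks, and the helper names are hypothetical. The paper's actual estimators and bin selection may differ.

```python
import math

def log_lr(power, noise_var, eps=1e-12):
    """Per-bin log likelihood ratio under the Gaussian statistical model."""
    gamma = power / max(noise_var, eps)   # a posteriori SNR
    xi = max(gamma - 1.0, 0.0)            # assumed ML-style a priori SNR
    return gamma * xi / (1.0 + xi) - math.log1p(xi)

def harmonic_bins(f0_hz, sample_rate, n_fft):
    """DFT bin indices nearest to multiples of f0 below the Nyquist rate."""
    bins = set()
    h = 1
    while h * f0_hz < sample_rate / 2:
        bins.add(round(h * f0_hz * n_fft / sample_rate))
        h += 1
    return sorted(bins)

def frame_is_speech(power_spectrum, noise_vars, threshold, bins=None):
    """Compare the mean log LR over the chosen bins with a threshold.

    bins=None uses every bin (conventional LRT); passing harmonic bins
    mimics the voiced-frame case described in the abstract.
    """
    if bins is None:
        bins = range(len(power_spectrum))
    scores = [log_lr(power_spectrum[k], noise_vars[k]) for k in bins]
    return sum(scores) / len(scores) > threshold
```

In a voiced frame, averaging over all bins dilutes the evidence with bins that hold only noise, while averaging over harmonic bins concentrates on where speech energy sits. That is the intuition this sketch is meant to convey, not a reproduction of the reported results.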


A robust automatic bird phrase classifier using dynamic time-warping with prominent region identification

Kaewtip

Tan

Alwan

et al. 2013


In this paper, we present a novel approach to birdsong phrase classification using template-based techniques suitable even for limited training data and noisy environments. The algorithm utilizes dynamic time-warping and prominent (high-energy) time-frequency regions of training spectrograms to derive templates. The algorithm is evaluated on 32 classes of Cassin's Vireo bird phrases. Using only three training examples per class, our algorithm yields a phrase accuracy of 96.23%, outperforming other classifiers (e.g., the 85.21% classification accuracy of an SVM). In the presence of additive noise (10 dB SNR degradation), the proposed classifier does not degrade significantly compared to the others.
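One way to picture "prominent region identification" is as a mask over the highest-energy cells of a training spectrogram, with comparisons restricted to the masked cells. The sketch below is an assumption-laden illustration, not the paper's procedure: the fixed `keep_fraction`, the RMS distance, and the function names `prominent_mask` and `masked_distance` are all hypothetical, and the DTW alignment that the paper combines with the regions is omitted.

```python
import math

def prominent_mask(spectrogram, keep_fraction=0.25):
    """Boolean mask marking the highest-energy cells of a spectrogram.

    `spectrogram` is a list of rows (frames) of non-negative energies.
    Cells tied with the cutoff value are also kept.
    """
    values = sorted((v for row in spectrogram for v in row), reverse=True)
    n_keep = max(1, int(len(values) * keep_fraction))
    cutoff = values[n_keep - 1]
    return [[v >= cutoff for v in row] for row in spectrogram]

def masked_distance(template, mask, test):
    """RMS difference between template and test over masked cells only."""
    total, count = 0.0, 0
    for t_row, m_row, x_row in zip(template, mask, test):
        for t, m, x in zip(t_row, m_row, x_row):
            if m:
                total += (t - x) ** 2
                count += 1
    return math.sqrt(total / count)
```

Because low-energy cells are ignored, noise that mostly affects quiet parts of the spectrogram changes the masked distance little, which is consistent with the robustness the abstract reports.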




Lee Ngee Tan

A sparse representation-based classifier for in-set bird phrase verification and classification with limited training data

Multi-band summary correlogram-based pitch detection for noisy speech

Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data

Voice activity detection using harmonic frequency components in likelihood ratio test

A robust automatic bird phrase classifier using dynamic time-warping with prominent region identification
