Automatic systems for detecting phrases in bird sounds are useful in several applications because they reduce the need for manual annotation. However, bird phrase detection is challenging due to limited training data and background noise. Data are limited because recordings are scarce or because some phrases are rare. Background noise arises from the recording environment itself, such as wind or other animals. This paper presents a different approach to birdsong phrase classification using template-based techniques suitable even for limited training data and noisy environments. The algorithm uses dynamic time warping (DTW) and prominent (high-energy) time-frequency regions of training spectrograms to derive templates. The performance of the proposed algorithm is compared with traditional DTW and hidden Markov model (HMM) methods under several training and test conditions. DTW works well when the data are limited, while HMMs do better when more data are available, yet both suffer when the background noise is severe. The proposed algorithm outperforms DTW and HMMs in most training and testing conditions, usually by a wide margin when the background noise level is high. The innovation of this work is that the proposed algorithm is robust to both limited training data and background noise.
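For readers unfamiliar with the DTW baseline mentioned in this abstract, the following is a minimal sketch of classic DTW with nearest-template classification. It is not the paper's proposed algorithm (the selection of prominent time-frequency regions is not shown), and the names `dtw_distance` and `classify`, the Euclidean frame distance, and the one-template-per-label layout are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension frames (lists of floats).
    The local distance is Euclidean, which is an assumption of this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # Allowed steps: insertion, deletion, or diagonal match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Return the label whose (hypothetical) template is closest under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy one-dimensional "spectrogram" frames, purely illustrative
templates = {"A": [[0.0], [1.0], [2.0]], "B": [[5.0], [5.0], [5.0]]}
query = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched version of "A"
label = classify(query, templates)
```

Because DTW allows a frame to be matched against several frames of the other sequence, the time-stretched query aligns with template "A" at zero cost, which is why the method can work with very few training examples.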
Orodispersible films (ODFs) are an attractive delivery system for a myriad of clinical applications and offer substantial economic and clinical rewards. However, the manufacturing of ODFs does not adhere to contemporary paradigms of personalised, on-demand medicine, nor to sustainable manufacturing. To address these shortcomings, both three-dimensional (3D) printing and machine learning (ML) were employed to provide on-demand manufacturing and quality control checks of ODFs. Direct ink writing (DIW) was able to fabricate complex ODF shapes, with thicknesses of less than 100 µm. ML algorithms were explored to classify the ODFs according to their active ingredient, using their near-infrared (NIR) spectra. A supervised linear discriminant analysis model was found to provide 100% accuracy in classifying ODFs. A subsequent partial least squares algorithm was applied to verify the dose, where coefficients of determination of 0.96, 0.99 and 0.98 were obtained for ODFs of paracetamol, caffeine, and theophylline, respectively. Therefore, it was concluded that the combination of 3D printing, NIR and ML can enable rapid production and verification of ODFs. Additionally, a machine vision tool was used to automate the in vitro testing. These collective digital technologies demonstrate the potential to automate the ODF workflow.
Pitch is an important property of birdsong. Accurate and automatic tracking of pitch for large numbers of recordings would be useful for automatic analysis of birdsong. Currently, pitch trackers such as YIN can work with carefully tuned parameters, but the characteristics of birdsong mean those optimal parameters can change quickly even within a single song. This paper presents YIN-bird, a modified version of YIN which exploits spectrogram properties to automatically set a minimum fundamental frequency parameter for YIN. This parameter is continuously updated without user intervention. A ground truth dataset of synthetic birdsong with known fundamental frequency is generated for evaluation of YIN-bird. In listener tests, expert birders described the synthetic samples as "sounding like original & can hardly tell it is synthetic". Gross pitch error on whistles and trills was reduced by up to 4%. An analysis of nasal sounds shows the challenge in accurate pitch tracking for this syllable type.
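To illustrate why the minimum fundamental frequency parameter matters, here is a simplified sketch of a YIN-style estimator for a single frame. It is not YIN-bird: the spectrogram-based estimation of the minimum frequency is not shown, and `f_min` is simply passed in by the caller. The function name `yin_f0`, the threshold of 0.1, and the omission of parabolic interpolation are assumptions of this sketch.

```python
import math


def yin_f0(frame, sample_rate, f_min, f_max, threshold=0.1):
    """Estimate f0 of one frame with a simplified YIN-style procedure."""
    tau_max = int(sample_rate / f_min)           # lowest f0 sets the longest lag
    tau_min = max(1, int(sample_rate / f_max))   # highest f0 sets the shortest lag
    window = len(frame) - tau_max
    if window <= 0:
        raise ValueError("frame too short for the requested f_min")

    # Difference function d(tau)
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(window))

    # Cumulative mean normalized difference d'(tau), with d'(0) = 1
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Take the first dip below the threshold and follow it to its local minimum
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1

    # No dip found: fall back to the global minimum in the search range
    best = min(range(tau_min, tau_max + 1), key=lambda t: cmnd[t])
    return sample_rate / best


# Synthetic 2 kHz tone sampled at 16 kHz (period of exactly 8 samples)
sr = 16000
tone = [math.sin(2 * math.pi * 2000 * n / sr) for n in range(400)]
f0 = yin_f0(tone, sr, f_min=500, f_max=4000)
```

In this sketch `f_min` fixes the longest lag searched, so a value that is too high excludes the true period while a value that is too low widens the search and invites subharmonic errors. That sensitivity is the motivation the abstract gives for updating the parameter automatically.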
Over time, a bird population's acoustic and morphological features can diverge from the parent species. A quantitative measure of difference between two populations of species/subspecies is extremely useful to zoologists. Work in this paper takes a dialect difference system first developed for speech and refines it to automatically measure vocalisation difference between bird populations by extracting pitch contours. The pitch contours are transposed into pitch codes. A variety of codebook schemes are proposed to represent the contour structure, including a vector quantization approach. The measure, called Bird Vocalisation Difference, is applied to bird populations with calls that are considered very similar, very different, and between these two extremes. Initial results are very promising, with the behaviour of the metric consistent with accepted levels of similarity for the populations tested to date. The influence of data size on the measure is investigated by using reduced datasets. Results of species pair classification using Gaussian mixture models with Mel-frequency cepstral coefficients are also given as a baseline indicator of class confusability.