We present improvements to the refinement stage of YANGsaf[1] (Yet ANother Glottal source analysis framework), a recently published F0 estimation algorithm by Kawahara et al., for noisy/breathy speech signals. The baseline system, based on time warping and weighted averaging of multi-band instantaneous frequency estimates, is still sensitive to additive noise when none of the harmonics provides a reliable frequency estimate at low SNR. We alleviate this problem by calibrating the weighted averaging process based on statistics gathered from a Monte-Carlo simulation, and by applying Kalman filtering with time-varying measurement and process distributions to the refined F0 trajectory. The improved algorithm, adYANGsaf (adaptive Yet ANother Glottal source analysis framework), achieves significantly higher accuracy and a smoother F0 trajectory on noisy speech while retaining its accuracy on clean speech, with little computational overhead.
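The Kalman filtering step described above can be illustrated with a minimal sketch. This is not the published adYANGsaf implementation; it is a generic scalar (random-walk) Kalman filter in which both the measurement variance and the process variance vary per frame, which is the key property the abstract describes. All function and variable names are hypothetical.

```python
import numpy as np

def kalman_filter_f0(f0_obs, meas_var, proc_var):
    """Scalar Kalman filter over an F0 trajectory (illustrative sketch).

    f0_obs:   observed F0 per frame, in Hz
    meas_var: per-frame measurement noise variance (time-varying)
    proc_var: per-frame process noise variance (time-varying)
    """
    n = len(f0_obs)
    x = f0_obs[0]      # state estimate (filtered F0)
    p = meas_var[0]    # state variance
    out = np.empty(n)
    out[0] = x
    for t in range(1, n):
        # Predict: random-walk F0 model, uncertainty grows by proc_var[t].
        p = p + proc_var[t]
        # Update: blend the prediction with the noisy observation;
        # a large meas_var[t] (unreliable frame) shrinks the gain.
        k = p / (p + meas_var[t])
        x = x + k * (f0_obs[t] - x)
        p = (1.0 - k) * p
        out[t] = x
    return out
```

Frames judged unreliable (e.g., at low SNR) would be given a large `meas_var`, so the filter coasts on its prediction there instead of following the noisy measurement.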
This study focuses on generating fundamental frequency (F0) curves of the singing voice from musical scores stored in a MIDI-like notation. Current statistical parametric approaches to singing F0 modeling have difficulty reproducing vibratos and the temporal details at note boundaries due to the oversmoothing tendency of statistical models. This paper presents a neural-network-based solution that models a pair of neighboring notes at a time (the transition model) and uses a separate network for generating vibratos (the sustain model). Predictions from the two models are combined by summation after proper enveloping to enforce continuity. In the training phase, mild misalignment between the scores and the target F0 is addressed by back-propagating the gradients to the networks' inputs. Subjective listening tests on the NITech singing database show that transition-sustain models are able to generate F0 trajectories close to the original performance.
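The "summation after proper enveloping" step can be sketched as follows. This is an assumption-laden illustration, not the paper's implementation: it treats the transition model's output as a baseline F0 contour and the sustain model's output as a vibrato deviation, and applies a raised-cosine envelope so the vibrato contribution vanishes at note boundaries, enforcing continuity there. All names and the envelope shape are hypothetical.

```python
import numpy as np

def combine_transition_sustain(f0_transition, f0_sustain, fade_len):
    """Sum a baseline contour and an enveloped vibrato deviation.

    f0_transition: baseline F0 contour for the note (Hz), length n
    f0_sustain:    vibrato deviation around the baseline (Hz), length n
    fade_len:      envelope ramp length in frames (fade_len <= n // 2)
    """
    n = len(f0_transition)
    env = np.ones(n)
    # Raised-cosine ramp: 0 at the note boundary, ~1 inside the note.
    ramp = 0.5 - 0.5 * np.cos(np.pi * np.arange(fade_len) / fade_len)
    env[:fade_len] = ramp          # fade the vibrato in after note onset
    env[-fade_len:] = ramp[::-1]   # fade it out before the next boundary
    return f0_transition + env * f0_sustain
```

Because the envelope is exactly zero at the first frame, the combined curve equals the transition model's prediction at the boundary, so adjacent notes join without a discontinuity from the vibrato term.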
An F0 and voicing status estimation algorithm for high-quality speech analysis/synthesis is proposed. The problem is approached from a different perspective: instead of directly modeling speech signals, we model the behavior of feature extractors under noise. Under time-frequency locality assumptions, the joint distribution of extracted features and target F0 can be characterized by training a bank of Gaussian mixture models (GMMs) on artificial data generated from Monte-Carlo simulations. The trained GMMs can then be used to generate a set of conditional distributions over the predicted F0, which are combined and post-processed by the Viterbi algorithm to give the final F0 trajectory. Evaluation on the CSTR and CMU Arctic speech databases shows that the proposed method, trained on fully synthetic data, achieves lower gross error rates than state-of-the-art methods.
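The Viterbi post-processing stage can be sketched generically. The code below is not the paper's implementation: it assumes each frame has already been reduced to a discrete set of F0 candidates with per-candidate log-likelihoods (standing in for the combined GMM conditional distributions), and finds the path that trades off those likelihoods against a penalty on large log-F0 jumps between frames. All names and the transition cost are hypothetical.

```python
import numpy as np

def viterbi_f0(candidates, log_emission, jump_penalty=1.0):
    """Pick one F0 candidate per frame via Viterbi decoding.

    candidates:   (T, K) array of F0 candidates in Hz
    log_emission: (T, K) log-likelihood of each candidate per frame
    jump_penalty: cost per unit of |log-F0| change between frames
    """
    T, K = candidates.shape
    logf = np.log(candidates)
    delta = log_emission[0].copy()        # best score ending at each candidate
    back = np.zeros((T, K), dtype=int)    # backpointers
    for t in range(1, T):
        # Transition cost penalizes large log-F0 jumps between frames.
        cost = -jump_penalty * np.abs(logf[t][None, :] - logf[t - 1][:, None])
        scores = delta[:, None] + cost    # (K_prev, K_cur)
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_emission[t]
    # Trace back the best path.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return candidates[np.arange(T), path]
```

The jump penalty is what suppresses isolated octave errors: a single frame whose emission mildly favors a spurious candidate is outvoted by the cost of jumping away from, and back to, the consistent trajectory.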