Abstract: We present the Dynamic Programming Projected Phase-Slope Algorithm (DYPSA) for automatic estimation of glottal closure instants (GCIs) in voiced speech. Accurate estimation of GCIs is an important tool that can be applied to a wide range of speech processing tasks including speech analysis, synthesis and coding. DYPSA is automatic and operates using the speech signal alone, without the need for an electroglottograph (EGG) signal. The algorithm employs the phase-slope function and a novel phase-slope projection technique for estimating GCI candidates from the speech signal. The most likely candidates are then selected using a dynamic programming technique to minimize a cost function that we define. We review and evaluate three existing methods of GCI estimation and compare the new DYPSA algorithm to them. Results are presented for the APLAWD and SAM databases, for which 95.7% and 93.1% of GCIs, respectively, are correctly identified.
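The candidate-selection step can be illustrated with a minimal dynamic programming sketch. It is not the DYPSA cost function, which the abstract does not specify: the function name `select_candidates`, the per-candidate `local_cost`, the fixed `period`, and the squared period-deviation penalty are all assumptions made for illustration, and the path is assumed to start at the first candidate and end at the last.

```python
def select_candidates(times, local_cost, period):
    """Pick a subset of candidate instants with minimum total cost.

    Hypothetical cost (an assumption, not the published one):
    sum of local costs of the chosen candidates plus, for each pair of
    consecutive chosen candidates, ((gap - period) / period) ** 2.
    The path is forced to start at candidate 0 and end at the last one.
    """
    n = len(times)
    best = [float("inf")] * n   # best[j]: minimum cost of a path ending at j
    prev = [-1] * n             # back-pointer to the previous chosen candidate
    best[0] = local_cost[0]
    for j in range(1, n):
        for i in range(j):
            gap = times[j] - times[i]
            cost = best[i] + ((gap - period) / period) ** 2 + local_cost[j]
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Trace back from the last candidate to recover the chosen indices.
    path, j = [], n - 1
    while j >= 0:
        path.append(j)
        j = prev[j]
    return path[::-1]


# Candidate times in ms; indices 1 and 3 are spurious, with higher local cost.
times = [0.0, 2.5, 5.0, 7.4, 10.0, 15.0]
local_cost = [0.1, 0.5, 0.1, 0.5, 0.1, 0.1]
chosen = select_candidates(times, local_cost, period=5.0)
print(chosen)  # [0, 2, 4, 5]
```

The quadratic loop over predecessor candidates keeps the sketch short; a practical implementation would likely restrict predecessors to a plausible range of pitch periods.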
We present PEFAC, a fundamental frequency estimation algorithm for speech that is able to identify voiced frames and estimate pitch reliably even at negative signal-to-noise ratios. The algorithm combines a normalization stage, to remove channel dependency and to attenuate strong noise components, with a harmonic summing filter applied in the log-frequency power spectral domain, the impulse response of which is chosen to sum the energy of the fundamental frequency harmonics while attenuating smoothly-varying noise components. Temporal continuity constraints are applied to the selected pitch candidates and a voiced speech probability is computed from the likelihood ratio of two classifiers, one for voiced speech and one for unvoiced speech/silence. We compare the performance of our algorithm with that of other widely used algorithms and demonstrate that it performs well in both high and low levels of additive noise.
Index Terms: Fundamental frequency, noisy speech, pitch, speech processing.
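The idea behind harmonic summing on a log-frequency axis can be sketched as follows. On that axis the k-th harmonic lies at a fixed offset of log2(k) above the fundamental, whatever the fundamental is, so a single comb of offsets serves every pitch candidate. This is a simplified illustration and not PEFAC itself: the function name `harmonic_sum_pitch`, its parameters, and the plain impulse comb are assumptions, and the normalization stage and the noise-attenuating shape of the actual filter are omitted.

```python
import numpy as np


def harmonic_sum_pitch(freqs, power, f_min=60.0, f_max=400.0,
                       n_harm=5, bins_per_octave=48):
    """Estimate f0 by summing harmonic power on a log-frequency grid.

    Simplified sketch: the comb is a set of unit impulses at log2(k),
    k = 1..n_harm (an assumption, not the published filter).
    """
    # Log-spaced grid from f_min up to the highest harmonic of f_max.
    n_bins = int(np.ceil(bins_per_octave * np.log2(n_harm * f_max / f_min))) + 1
    grid = f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    p_log = np.interp(grid, freqs, power)

    # Bin offset of harmonic k relative to the fundamental (same for all f0).
    offsets = np.round(
        bins_per_octave * np.log2(np.arange(1, n_harm + 1))
    ).astype(int)

    # Score every candidate f0 in [f_min, f_max] by summing its harmonics.
    n_cand = int(np.floor(bins_per_octave * np.log2(f_max / f_min))) + 1
    score = np.zeros(n_cand)
    for off in offsets:
        score += p_log[off:off + n_cand]
    return grid[np.argmax(score)]


# Synthetic power spectrum: harmonics of 200 Hz with decaying power.
freqs = np.arange(0.0, 4000.0, 1.0)
power = sum((1.0 / k**2) * np.exp(-0.5 * ((freqs - 200.0 * k) / 10.0) ** 2)
            for k in range(1, 9))
est = harmonic_sum_pitch(freqs, power)
print(f"estimated f0 ~ {est:.1f} Hz")
```

The estimate is quantized to the log-frequency grid, so it lands on the grid point nearest 200 Hz rather than on 200 Hz exactly.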
Abstract: In this paper, we apply correlation theory methods to obtain a model for the near-carrier oscillator power-spectral density (PSD). Based on the measurement-driven representation of phase noise as a sum of power-law processes, we evaluate closed form expressions for the relevant oscillator autocorrelation functions. These expressions form the basis of an enhanced oscillator spectral model that has a Gaussian PSD at near-carrier frequencies followed by a sequence of power-law regions. New results for the effect of white phase noise, flicker phase noise and random walk frequency modulated phase noise on the near-carrier oscillator PSD are derived. In particular, in the case of 1/f phase noise, we show that despite its lack of stationarity it is possible to derive a closed form expression for its effect on an oscillator PSD and show that the oscillator output can be considered to be wide-sense stationary.
Index Terms: Correlation theory, frequency noise, Gaussian PSD, Lorentzian PSD, oscillator power-spectral density (PSD), phase noise, power-law process.
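For orientation, the power-law representation of phase noise mentioned above is conventionally written as a sum of inverse powers of the offset frequency. The coefficient symbols $h_k$ below are assumed notation for illustration and may differ from those used in the paper.

```latex
S_\phi(f) \;=\; \sum_{k=0}^{4} \frac{h_k}{f^{k}}
% k = 0: white phase noise
% k = 1: flicker (1/f) phase noise
% k = 2: white frequency noise (random walk phase)
% k = 3: flicker frequency noise
% k = 4: random walk frequency noise
```

Each term dominates over a different range of offset frequencies, which is what produces a sequence of power-law regions away from the carrier.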
We present a Bayesian estimator that performs log-spectrum estimation of both speech and noise, and is used as a Bayesian Kalman filter update step for single-channel speech enhancement in the modulation domain. We use Kalman filtering in the log-power spectral domain rather than in the amplitude or power spectral domains. In the Bayesian Kalman filter update step, we define the posterior distribution of the clean speech and noise log-power spectra as a two-dimensional multivariate Gaussian distribution. We utilize a Kalman filter observation constraint surface in the three-dimensional space, where the third dimension is the phase factor. We evaluate the results of the phase-sensitive log-spectrum Kalman filter by comparing them with the results obtained by traditional noise suppression techniques and by an alternative Kalman filtering technique that assumes additivity of speech and noise in the power spectral domain.
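The role of the phase factor can be illustrated with the standard identity relating the noisy log-power to the speech and noise log-powers in a single time-frequency bin. This is a sketch of that general identity only, not of the paper's estimator; the function name `observed_log_power` and the variable names are assumptions.

```python
import numpy as np


def observed_log_power(s, n, alpha):
    """Noisy log-power given speech log-power s, noise log-power n and
    phase factor alpha = cos(phase difference between speech and noise).

    Follows from |S + N|^2 = |S|^2 + |N|^2 + 2 * alpha * |S| * |N|.
    Setting alpha = 0 recovers plain additivity in the power domain.
    """
    return np.log(np.exp(s) + np.exp(n) + 2.0 * alpha * np.exp(0.5 * (s + n)))


# Check against a direct complex-valued computation for one bin.
S = 2.0 * np.exp(1j * 0.3)   # speech STFT coefficient (example values)
N = 0.5 * np.exp(1j * 1.1)   # noise STFT coefficient (example values)
s, n = np.log(abs(S) ** 2), np.log(abs(N) ** 2)
alpha = np.cos(np.angle(S) - np.angle(N))
y_direct = np.log(abs(S + N) ** 2)
y_model = observed_log_power(s, n, alpha)
print(np.isclose(y_direct, y_model))  # True
```

For fixed noisy log-power, the triples (s, n, alpha) that satisfy this relation form a surface in three dimensions, which is one way to picture an observation constraint that includes the phase factor.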