Fundamental frequency (F0) estimation for quasiharmonic signals is an important task in music signal processing. Many previously developed techniques have suffered from unsatisfactory performance due to ambiguous spectra, noise perturbations, wide frequency range, vibrato, and other common artifacts encountered in musical signals. In this paper, a new two-way mismatch (TWM) procedure for estimating F0 is described that may lead to improved results in this area. This computer-based method uses the quasiharmonic assumption to guide a search for F0 based on the short-time spectra of an input signal. The estimated F0 is chosen to minimize discrepancies between measured partial frequencies and harmonic frequencies generated by trial values of F0. For each trial F0, mismatches between the generated harmonics and the measured partial frequencies are averaged over a fixed subset of the available partials. A weighting scheme is used to reduce the susceptibility of the procedure to the presence of noise or the absence of certain partials in the spectral data. Graphs of F0 estimate versus time are presented for several representative recorded solo musical instrument and voice passages. Some special strategies for extending the TWM procedure to F0 estimation of two simultaneous voices in duet recordings are also discussed.
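The two-way search described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not give the weighting formula, so the error terms, the constants `p`, `q`, `r`, and `rho`, and the function names `twm_error` and `estimate_f0` are illustrative assumptions. The sketch also averages over all supplied peaks rather than a fixed subset, and it generates harmonics up to the highest measured peak.

```python
import math
import numpy as np

def twm_error(f0, peak_freqs, peak_amps, p=0.5, q=1.4, r=0.5, rho=0.33):
    """Two-way mismatch error for one trial F0 (illustrative weighting, assumed)."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    peak_amps = np.asarray(peak_amps, dtype=float)
    rel_amps = peak_amps / peak_amps.max()

    # Assumption: predict harmonics up to the highest measured peak.
    n_harm = max(1, math.ceil(peak_freqs.max() / f0))
    harmonics = f0 * np.arange(1, n_harm + 1)

    # Distance between every predicted harmonic and every measured peak.
    dist = np.abs(harmonics[:, None] - peak_freqs[None, :])

    # Predicted-to-measured: each harmonic looks for its nearest peak.
    nearest_peak = dist.argmin(axis=1)
    norm = dist[np.arange(n_harm), nearest_peak] * harmonics ** (-p)
    err_pm = np.sum(norm + rel_amps[nearest_peak] * (q * norm - r))

    # Measured-to-predicted: each peak looks for its nearest harmonic.
    nearest_harm = dist.argmin(axis=0)
    norm = dist[nearest_harm, np.arange(len(peak_freqs))] * peak_freqs ** (-p)
    err_mp = np.sum(norm + rel_amps * (q * norm - r))

    # Average each direction over the number of terms, then combine.
    return err_pm / n_harm + rho * err_mp / len(peak_freqs)

def estimate_f0(peak_freqs, peak_amps, candidates):
    """Return the candidate F0 with the smallest two-way mismatch error."""
    errors = [twm_error(f, peak_freqs, peak_amps) for f in candidates]
    return candidates[int(np.argmin(errors))]

# Hypothetical spectral peaks of a 220 Hz tone (frequency in Hz, linear amplitude).
peaks_hz = [220.0, 440.0, 660.0, 880.0, 1100.0]
amps = [1.0, 0.8, 0.6, 0.4, 0.2]
best = estimate_f0(peaks_hz, amps, [110.0, 220.0, 440.0])
```

Checking the mismatch in both directions is what penalizes octave errors in this toy case: a trial F0 one octave too low predicts harmonics with no measured partial nearby, and a trial F0 one octave too high leaves measured partials without a matching harmonic.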
The perceptual salience of several outstanding features of quasiharmonic, time-variant spectra was investigated in musical instrument sounds. Spectral analyses of sounds from seven musical instruments (clarinet, flute, oboe, trumpet, violin, harpsichord, and marimba) produced time-varying harmonic amplitude and frequency data. Six basic data simplifications and five combinations of them were applied to the reference tones: amplitude-variation smoothing, coherent variation of amplitudes over time, spectral-envelope smoothing, forced harmonic-frequency variation, frequency-variation smoothing, and harmonic-frequency flattening. Listeners were asked to discriminate sounds resynthesized with simplified data from reference sounds resynthesized with the full data. Averaged over the seven instruments, discrimination was very good for spectral-envelope smoothing and amplitude-envelope coherence, but was moderate to poor, in decreasing order, for forced harmonic-frequency variation, frequency-variation smoothing, harmonic-frequency flattening, and amplitude-variation smoothing. Discrimination of combinations of simplifications was equivalent to that of the most potent constituent simplification. Objective measurements were made on the spectral data for harmonic amplitude, harmonic frequency, and spectral centroid changes resulting from simplifications. These measures were found to correlate well with discrimination results, indicating that listeners have access to a relatively fine-grained sensory representation of musical instrument sounds.
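One of the objective measures mentioned in this abstract, the spectral centroid, is commonly computed as the amplitude-weighted mean of the partial frequencies. The Python sketch below uses that common definition. The abstract does not state the exact formula used in the study, so the definition, the function name `spectral_centroid`, and the example data are assumptions.

```python
import numpy as np

def spectral_centroid(harmonic_freqs, harmonic_amps):
    """Amplitude-weighted mean frequency of one analysis frame (assumed definition)."""
    freqs = np.asarray(harmonic_freqs, dtype=float)
    amps = np.asarray(harmonic_amps, dtype=float)
    total = amps.sum()
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    return float((freqs * amps).sum() / total)

# Hypothetical frame with three harmonics of a 100 Hz tone.
flat = spectral_centroid([100.0, 200.0, 300.0], [1.0, 1.0, 1.0])
dark = spectral_centroid([100.0, 200.0, 300.0], [3.0, 0.0, 1.0])
```

Applying the function frame by frame to time-varying harmonic data gives a centroid trajectory, which can then be compared between a reference tone and a simplified one.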
The time-varying spectra of eight musical instrument sounds were randomly altered by a time-invariant process to determine how detection of spectral alteration varies with degree of alteration, instrument, musical experience, and spectral variation. Sounds were resynthesized with centroids equalized to the original sounds, with frequencies harmonically flattened, and with average spectral error levels of 8%, 16%, 24%, 32%, and 48%. Listeners were asked to discriminate the randomly altered sounds from reference sounds resynthesized from the original data. For all eight instruments, discrimination was very good for the 32% and 48% error levels, moderate for the 16% and 24% error levels, and poor for the 8% error level. When the error levels were 16%, 24%, and 32%, the scores of musically experienced listeners were found to be significantly better than the scores of listeners with no musical experience. Also, in this same error level range, discrimination was significantly affected by the instrument tested. For error levels of 16% and 24%, discrimination scores were significantly but negatively correlated with measures of spectral incoherence and normalized centroid deviation on unaltered instrument spectra, suggesting that the presence of dynamic spectral variations tends to increase the difficulty of detecting spectral alterations. Correlation between discrimination and a measure of spectral irregularity was comparatively low.