We apply speaker normalization to vocode the speech of a new input speaker with a speaker-dependent segment vocoder operating at a very low bit rate, below 300 b/s. The normalization consists of a spectral transformation, applied to the spectral parameter vectors of the reference speaker, that is intended to improve the match between the reference and input speakers. The optimal spectral transformation is determined by an iterative algorithm that minimizes the quantization error of the segment vocoder and is guaranteed to converge to a local optimum. We demonstrate the general algorithm by deriving a linear least squares solution for the spectral transformation, and we present results for several male and female speakers.
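The iterative estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an affine transformation, a squared Euclidean distortion, and single spectral vectors in place of the variable-length segments a segment vocoder would quantize. The function names (`nearest_codeword`, `fit_affine`, `apply_affine`, `normalize_codebook`) are hypothetical. Each pass quantizes the input speaker's vectors with the transformed reference codebook, then refits the transformation by linear least squares. Neither step can increase the quantization error, which is why a procedure of this kind settles at a local optimum.

```python
import numpy as np


def nearest_codeword(frames, codebook):
    """Return the index of the closest codeword per frame and the total squared error."""
    dist = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), dist.min(axis=1).sum()


def apply_affine(W, vectors):
    """Apply the affine map stored as a (dim + 1, dim) matrix; the last row is the bias."""
    augmented = np.hstack([vectors, np.ones((len(vectors), 1))])
    return augmented @ W


def fit_affine(ref, target):
    """Least squares fit of W minimizing ||[ref, 1] W - target||^2."""
    augmented = np.hstack([ref, np.ones((len(ref), 1))])
    W, *_ = np.linalg.lstsq(augmented, target, rcond=None)
    return W


def normalize_codebook(codebook, frames, n_iter=20, tol=1e-9):
    """Alternate quantization and least squares refitting of the transformation.

    codebook: reference speaker's spectral vectors, shape (K, dim)
    frames:   input speaker's spectral vectors, shape (N, dim)
    Returns the transformation and the quantization error recorded at each pass.
    """
    dim = codebook.shape[1]
    W = np.vstack([np.eye(dim), np.zeros((1, dim))])  # start from the identity map
    errors = []
    for _ in range(n_iter):
        # Step 1: quantize the input frames with the transformed reference codebook.
        idx, err = nearest_codeword(frames, apply_affine(W, codebook))
        errors.append(err)
        if len(errors) > 1 and errors[-2] - errors[-1] < tol:
            break
        # Step 2: refit the transformation with the assignments held fixed.
        W = fit_affine(codebook[idx], frames)
    return W, errors
```

Calling `normalize_codebook(codebook, frames)` returns the fitted transformation together with the error trajectory, which should be non-increasing from one pass to the next.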