At bit rates between 4 and 16 kbit/s, many state‐of‐the‐art speech coding algorithms fall into the class of linear‐prediction based analysis‐by‐synthesis (LPAS) speech coders. At the lower bit rates, the waveform matching on which LPAS coders rely constrains the speech quality. To overcome this drawback, we present a coder (RCELP) that uses a generalization of the analysis‐by‐synthesis paradigm. This generalization relaxes the waveform‐matching constraints without affecting speech quality. We describe several implementations at bit rates between 4 and 6 kbit/s. MOS tests show that a 6 kbit/s RCELP has a quality similar to or better than that of the 13 kbit/s GSM full‐rate coder, and that a 4.4 kbit/s RCELP has a speech quality significantly better than that of the 4.8 kbit/s FS1016 standard.
In speech coders, accurate modeling of the level of periodicity of speech is essential for good-quality reconstruction. The generalized analysis-by-synthesis principle used in RCELP allows particularly efficient modeling of this speech attribute. We describe (1) refinements of the generalized analysis-by-synthesis implementation and (2) a procedure that exploits the low update rate of the pitch period in RCELP to enhance performance in the case of frame erasures. MOS test results confirm that an 8 kb/s RCELP coder using these principles provides excellent performance.