A novel speech coding algorithm, named pitch synchronous multi-band (PSMB), is proposed. It uses the multi-band excitation (MBE) model to generate a representative pitch-cycle waveform (PCW) for each frame. The representative PCW of a frame is encoded by two out of three codebooks, depending upon whether the frame is related or unrelated to the previous frame. The new speech coder introduces a pitch-period-based coding feature. The PSMB coder operating at 4 kbps outperforms the Inmarsat 4.15 kbps IMBE coder by a clear margin. It is also found to be slightly better than the FS1016 4.8 kbps code-excited linear predictive (CELP) coder in terms of perceptual quality. Fast search algorithms for the three codebooks used in PSMB are also developed.

1. INTRODUCTION

A common feature of code-excited linear predictive (CELP) coders [1] is their frame-based procedure. However, the basic unit of a voiced speech waveform is one pitch period in duration. It therefore seems more efficient to consider a pitch-period-based approach instead of the conventional frame-based approach in designing a speech coder.

The new speech coder includes the pitch-period-based feature. It concentrates on the processing of the PCWs. To achieve fixed bit rate coding, most of the parameters are estimated using a representative PCW generated for each analysis frame. The multi-band excitation (MBE) model [2] is used to generate the PCW. After the PCW has been efficiently encoded, speech signals are synthesized using the method described in [3].

[Fig. 1. Basic structure of the PSMB coder. Recoverable block labels: a frame of speech signals; compute spectrum with FFT; estimate refined pitch period.]

In adopting the pitch-period-based feature, the PSMB coder is similar to the prototype waveform interpolation (PWI) coder [4]. However, there is a major difference. The PSMB coder encodes both voiced and unvoiced signals, whereas the PWI coder encodes voiced signals only.
Although the PCW is only meaningful for voiced speech signals, the method of generating PCWs and the scheme of encoding PCWs (to be discussed shortly) make it suitable for the new speech coder to process unvoiced speech signals as well. The PSMB coder is also related to the MBE model because PCWs are obtained by using multi-band analysis. However, some of the weaknesses of the MBE model have been overcome [5] in the proposed coder by encoding the PCW with a closed-loop analysis-by-synthesis (ABS) structure.
2. PSMB SPEECH CODER

The basic structure of the PSMB coder is presented in Fig. 1. A refined pitch period, which is obtained as described in [3], is used to divide the spectrum into several bands, each of which encompasses one harmonic. The voiced/unvoiced (v/uv) decision is made for each group of three bands. A single PCW can be generated as:

$$ w(i) = \sum_{j=1}^{L} M_j \cos\left( \frac{2\pi i j}{P} + \theta_j \right), \qquad i = 0, 1, \ldots, P-1 \qquad (1) $$

where $M_j$ is the magnitude and $\theta_j$ is the phase of the $j$-th band, $P$ is the integer part of the refined pitch period and $L$ is the number of bands. For simplicity, the integer part of the refined pitch period will be referred to as the pitch period. The...
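
As an illustration, Eq. (1) can be evaluated directly as a sum of harmonically spaced cosines. The following is a minimal Python sketch and not the authors' implementation: the function name `synthesize_pcw` and its argument names are hypothetical, and the band magnitudes and phases are assumed to be already available from the multi-band analysis.

```python
import math


def synthesize_pcw(magnitudes, phases, pitch_period):
    """Evaluate Eq. (1): w(i) = sum_{j=1..L} M_j * cos(2*pi*i*j/P + theta_j).

    magnitudes   -- band magnitudes M_1..M_L (hypothetical argument name)
    phases       -- band phases theta_1..theta_L in radians (hypothetical name)
    pitch_period -- integer pitch period P, in samples
    """
    if len(magnitudes) != len(phases):
        raise ValueError("magnitudes and phases must have the same length")
    if pitch_period <= 0:
        raise ValueError("pitch_period must be a positive integer")

    pcw = []
    for i in range(pitch_period):
        sample = 0.0
        # Band index j starts at 1, matching the lower limit of the sum.
        for j, (m, theta) in enumerate(zip(magnitudes, phases), start=1):
            sample += m * math.cos(2.0 * math.pi * i * j / pitch_period + theta)
        pcw.append(sample)
    return pcw
```

For example, `synthesize_pcw([1.0], [0.0], 4)` returns one cycle of a single cosine sampled at four points. The output always has exactly `pitch_period` samples, so the waveform length changes from frame to frame with the pitch.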