This paper proposes a new aliasing cancelation algorithm for the transition between non-aliased coding and transform coding with time-domain aliasing cancelation (TDAC). It is effectively applied to unified speech and audio coding (USAC), which was recently standardized by the Moving Picture Experts Group (MPEG). Since USAC combines two coding methods with fundamentally different structures, a special process called forward aliasing cancelation (FAC) is needed in the transition region. Unlike the FAC algorithm embedded in the current standard, the proposed algorithm does not require additional bits to encode the aliasing cancelation terms, because it appropriately reuses adjacent decoded samples. Consequently, around 5% of the total bits are saved at the 16- and 24-kbps operating modes for speech-like signals. For performance verification, the proposed algorithm is integrated into the decoding module of the USAC common encoder (JAME), which follows the standard process exactly. Both objective and subjective experimental results confirm the feasibility of the proposed algorithm, especially for content that requires a high percentage of mode switching.
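For context, a minimal sketch of the standard TDAC mechanism (the textbook MDCT folding identity, not taken from the paper itself) illustrates why a transition away from transform coding leaves uncanceled aliasing. With a 2N-sample windowed frame split into quarters $(a, b, c, d)$ and $(\cdot)_R$ denoting time reversal, the MDCT/IMDCT pair satisfies, up to a normalization constant,
\[
\mathrm{IMDCT}\bigl(\mathrm{MDCT}(a, b, c, d)\bigr) \propto \bigl(a - b_R,\; b - a_R,\; c + d_R,\; d + c_R\bigr),
\]
so every decoded frame contains folded aliasing terms that are removed only by overlap-adding the neighboring frame, provided the analysis/synthesis window satisfies the Princen-Bradley conditions
\[
w(n)^2 + w(n+N)^2 = 1, \qquad w(n) = w(2N-1-n), \qquad 0 \le n < N.
\]
When the neighboring frame is coded without TDAC, the canceling contribution is missing at the frame boundary; this is the term that FAC, or in the proposed algorithm the adjacent decoded samples, must supply.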
This paper proposes audio coding with an efficient long-term prediction method that enhances the perceptual quality of audio codecs for speech input signals at low bit-rates. MPEG-4 AAC-LTP exploited a similar concept, but its improvement was not significant: the prediction gain was small because of long prediction lags and the aliased components introduced by the transform with the time-domain aliasing cancelation (TDAC) technique. The proposed algorithm increases the prediction gain by employing a de-harmonizing predictor and a long-term compensation filter. The look-back memory elements are first constructed by applying the de-harmonizing predictor to the input signal; the prediction residual is then encoded and decoded by transform audio coding. Finally, the long-term compensation filter is applied to the updated look-back memory of the decoded prediction residual to obtain the synthesized signal. Experimental results show that the proposed algorithm achieves much lower spectral distortion and higher perceptual quality than conventional approaches, especially for harmonic signals such as voiced speech.
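As a point of reference, a generic single-tap long-term predictor (a common textbook form; the exact de-harmonizing predictor and compensation filter of the paper are not specified in this abstract and may differ) removes the harmonic structure of a signal $x(n)$ with pitch lag $T$ and gain $g$,
\[
r(n) = x(n) - g\,x(n-T),
\]
and the matching long-term synthesis (compensation) filter $1/(1 - g z^{-T})$ restores it from the decoded residual $\hat{r}(n)$:
\[
\hat{x}(n) = \hat{r}(n) + g\,\hat{x}(n-T).
\]
The benefit of such a predictor is commonly quantified by the prediction gain $10\log_{10}\bigl(\sigma_x^2/\sigma_r^2\bigr)$, which the proposed scheme aims to increase relative to AAC-LTP, whose long prediction lags and aliased reference samples limit it.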