A SPATIAL SQUEEZING APPROACH TO AMBISONIC AUDIO COMPRESSION
Bin Cheng, Christian Ritz and Ian Burnett
Whisper Laboratories, University of Wollongong, Wollongong, NSW, Australia
bc362@uow.edu.au, critz@uow.edu.au, ianb@uow.edu.au
ABSTRACT
Spatially Squeezed Surround Audio Coding (S3AC) has previously been shown to provide efficient coding with perceptually accurate soundfield reconstruction when applied to ITU 5.1 multichannel audio. This paper investigates the application of S3AC to the coding of Ambisonic audio recordings. Traditionally, Ambisonics achieves compression and backward compatibility by using UHJ matrixing to obtain a stereo signal. This paper describes the relationship between S3AC and Ambisonic B-format signals, and presents and evaluates alternative approaches that derive a stereo or mono downmix signal based on S3AC. The mono downmix approach uses side information consisting of spatial cues that are quantized based on novel source localization listening experiments. Objective and subjective tests demonstrate significant improvements in the localization of sound sources when the compressed B-format signals are decoded for 5.1 loudspeaker playback.
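To make the idea of a mono downmix accompanied by a quantized spatial cue concrete, the following is a minimal sketch and not the S3AC algorithm itself. It assumes a single source in the horizontal plane, the conventional first-order B-format encoding gains (W scaled by 1/sqrt(2)), an azimuth estimate taken from the time-averaged W·X and W·Y products, and a uniform 5-degree quantizer. The function names and the quantizer step are hypothetical choices made for illustration; the paper derives its quantization from listening experiments.

```python
import math

def encode_b_format(samples, azimuth_deg):
    """Encode a mono signal as horizontal first-order B-format (W, X, Y)."""
    az = math.radians(azimuth_deg)
    w = [s / math.sqrt(2.0) for s in samples]  # omnidirectional component
    x = [s * math.cos(az) for s in samples]    # front-back figure-of-eight
    y = [s * math.sin(az) for s in samples]    # left-right figure-of-eight
    return w, x, y

def estimate_azimuth(w, x, y):
    """Estimate the source azimuth (degrees, 0..360) from averaged W*X and W*Y."""
    ix = sum(a * b for a, b in zip(w, x))
    iy = sum(a * b for a, b in zip(w, y))
    return math.degrees(math.atan2(iy, ix)) % 360.0

def quantize_azimuth(azimuth_deg, step_deg=5.0):
    """Uniform quantizer (assumed step size; illustrative only)."""
    return (round(azimuth_deg / step_deg) * step_deg) % 360.0

def decode_to_b_format(w, azimuth_deg):
    """Rebuild W, X, Y from the mono downmix (here simply W) and one azimuth cue."""
    mono = [s * math.sqrt(2.0) for s in w]     # undo the W gain
    return encode_b_format(mono, azimuth_deg)

# Example: a 440 Hz tone at 112 degrees, sampled at 48 kHz.
signal = [math.sin(2.0 * math.pi * 440.0 * n / 48000.0) for n in range(480)]
w, x, y = encode_b_format(signal, 112.0)
cue = quantize_azimuth(estimate_azimuth(w, x, y))
w2, x2, y2 = decode_to_b_format(w, cue)
```

In this sketch only the W channel and one quantized angle are transmitted, so the reconstructed source position differs from the original by at most half the quantizer step. A practical coder would estimate such cues per time-frequency region rather than once per signal.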