3D graphics are now well established on the web, alongside audio and video. In fact, 3D scenes are often combined with audio and video to create virtual worlds. However, the heterogeneous nature of these media components raises synchronization and packaging challenges. To address these challenges, we propose packaging 3D scenes, together with audio and video, inside MP4 containers. This way, the 3D content and the other media are delivered as a single unit, and on the receiving end the content can be extracted and synchronized from within the browser. In this paper, we explain our methodology and present an end-to-end example scenario, along with its implementation using open-source tools.