Serious games are gaining popularity across a wide range of educational and training applications because of their ability to engage and motivate learners. Recent hardware and computational advances allow developers to build applications with a high level of fidelity (realism) and novel interaction techniques. Despite these advances, however, real-time, high-fidelity rendering of the complex virtual environments found in many serious games is still not feasible across all modalities. Perceptual-based rendering exploits properties of the multi-modal human perceptual system to reduce computational requirements without any perceptible degradation of the rendered scene. A series of human-based experiments demonstrated a potentially strong effect of sound on visual fidelity perception and on task performance. However, these effects were subjective: the influence of sound depended on various individual factors, including musical listening preferences. This suggests the importance of customizing (individualizing) a serious game's virtual environment with respect to audio-visual fidelity, background sounds, and related settings. In this paper, details regarding this series of audio-visual experiments are provided, followed by a description of current work examining the customization of a serious game's virtual environment by each user through a game-based calibration method.
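To make the idea of perceptual-based rendering concrete, the following is a minimal, purely illustrative sketch (not the method described in this paper): a renderer that lowers its geometric level of detail when background sound is expected to mask the visual degradation. The function name `select_lod` and the scalar `audio_masking` factor are assumptions introduced here for illustration only.

```python
def select_lod(base_lod: int, audio_masking: float, min_lod: int = 0) -> int:
    """Pick a level of detail (LOD), reducing detail when sound is
    expected to mask visual degradation.

    base_lod      -- LOD the renderer would use with no perceptual model
                     (higher means more detail)
    audio_masking -- assumed masking strength in [0, 1], where 1 means
                     sound strongly draws attention away from visual detail
    """
    if not 0.0 <= audio_masking <= 1.0:
        raise ValueError("audio_masking must be in [0, 1]")
    # Under strong masking, drop up to half of the detail levels.
    reduction = int(round(audio_masking * base_lod / 2))
    return max(min_lod, base_lod - reduction)

# With silence the full-detail model is kept; with strong masking the
# renderer can fall back to a coarser mesh and save computation.
print(select_lod(8, 0.0))  # 8
print(select_lod(8, 1.0))  # 4
```

In a real system the masking factor would come from a perceptual model calibrated per user, which is exactly the kind of individual variation the experiments above motivate.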