Figure 1: Our hybrid technique is able to model high-fidelity acoustic effects for large, complex indoor or outdoor scenes at interactive rates: (a) building surrounded by walls, (b) underground parking garage, and (c) reservoir scene in Half-Life 2.

Abstract: We present a novel hybrid approach that couples geometric and numerical acoustic techniques for interactive sound propagation in complex environments. Our formulation is based on a combination of spatial and frequency decomposition of the sound field. We use numerical wave-based techniques to precompute the pressure field in the near-object regions and geometric propagation techniques in the far-field regions to model sound propagation. We present a novel two-way pressure coupling technique at the interface of near-object and far-field regions. At runtime, the impulse response at the listener position is computed at interactive rates from the stored pressure field using interpolation techniques. Our system is able to simulate high-fidelity acoustic effects such as diffraction, scattering, low-pass filtering behind obstructions, reverberation, and high-order reflections in large, complex indoor and outdoor environments, including scenes from the Half-Life 2 game engine. The pressure computation requires orders of magnitude less memory than standard wave-based numerical techniques.
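The runtime stage described above (looking up the precomputed pressure field and interpolating it at the listener's position) can be illustrated with a minimal sketch. The inverse-distance weighting, the function name, and the data layout below are assumptions for illustration only; the abstract does not specify the paper's actual interpolation scheme or the near-field/far-field coupling details.

```python
import numpy as np

def interpolate_ir(listener_pos, sample_points, stored_pressure):
    """Hypothetical sketch: estimate the impulse response at an arbitrary
    listener position by inverse-distance-weighted interpolation of
    pressure fields precomputed at nearby sample points.

    sample_points   : (N, 3) array of precomputed listener positions
    stored_pressure : (N, T) array; row i is the pressure signal
                      (impulse response) stored at sample_points[i]
    """
    d = np.linalg.norm(sample_points - listener_pos, axis=1)
    if np.any(d < 1e-9):                 # listener sits on a sample point
        return stored_pressure[np.argmin(d)]
    w = 1.0 / d**2                       # inverse-distance weights
    w /= w.sum()
    return w @ stored_pressure           # weighted blend of stored IRs
```

At runtime such a lookup costs only a weighted sum over a handful of nearby samples, which is what makes interactive rates plausible once the expensive wave simulation has been moved to precomputation.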
Linear modal synthesis methods have often been used to generate sounds for rigid bodies. One of the key challenges in widely adopting such techniques is the lack of automatic determination of satisfactory material parameters that recreate the realistic audio quality of sounding materials. We introduce a novel method that uses pre-recorded audio clips to estimate material parameters capturing the inherent quality of recorded sounding materials. Our method extracts perceptually salient features from audio examples. Based on psychoacoustic principles, we design a parameter estimation algorithm that uses an optimization framework and these salient features to guide the search for the best material parameters for modal synthesis. We also present a method that compensates for the differences between the real-world recording and the sound synthesized using linear modal synthesis alone to create the final synthesized audio. The resulting audio generated from this sound synthesis pipeline preserves the sense of material of the recorded audio example. Moreover, both the estimated material parameters and the residual compensation naturally transfer to virtual objects of different sizes and shapes, while the synthesized sounds vary accordingly. A perceptual study shows that the results of this system compare well with real-world recordings in terms of material perception.
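As background for the parameter estimation, the linear modal model itself is simple: each mode is a damped sinusoid whose decay rate is tied to the material parameters. The sketch below assumes the common Rayleigh damping model, in which mode i decays at rate (alpha + beta * omega_i^2) / 2, and uses hypothetical mode frequencies and amplitudes; the feature extraction, optimization, and residual-compensation stages are not shown.

```python
import numpy as np

def modal_synth(freqs, amps, alpha, beta, dur=1.0, fs=44100):
    """Minimal sketch of linear modal synthesis: a sum of damped
    sinusoids. (alpha, beta) are the Rayleigh damping material
    parameters that an estimation method of this kind searches for."""
    t = np.arange(int(dur * fs)) / fs
    out = np.zeros_like(t)
    for f, a in zip(freqs, amps):
        omega = 2 * np.pi * f
        d = 0.5 * (alpha + beta * omega**2)   # per-mode damping rate
        out += a * np.exp(-d * t) * np.sin(omega * t)
    return out / max(np.max(np.abs(out)), 1e-12)  # normalize to [-1, 1]

# Hypothetical modes for a small metal object (illustrative values only)
clip = modal_synth(freqs=[523.0, 1245.0, 2210.0],
                   amps=[1.0, 0.6, 0.3], alpha=1.2, beta=3e-7)
```

Because alpha and beta are properties of the material rather than of any one object, parameters fitted on one recording carry over to differently shaped or sized objects whose mode frequencies differ, which is the transferability the abstract describes.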
The recent emergence of machine-learning based generative models for speech suggests that a significant reduction in the bit rate of speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the ineffectiveness of modeling a sum of independent signals with a single autoregressive model. We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance. We also show that applying noise reduction to remove unwanted signals further increases performance. We provide extensive subjective performance evaluations showing that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for real-world speech signals at reasonable computational complexity.
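A minimal sketch of the idea behind predictive-variance regularization follows, assuming a Gaussian predictive distribution and a simple variance penalty for illustration; the paper's actual output distribution, penalty form, and weighting may differ. The intuition is that penalizing large predicted variance keeps the maximum likelihood criterion from accommodating outliers by inflating its uncertainty.

```python
import numpy as np

def regularized_nll(x, mu, log_var, lam=0.1):
    """Illustrative sketch of predictive-variance regularization
    (assumed Gaussian output; not the paper's exact formulation).

    x       : target samples
    mu      : model's predicted means
    log_var : model's predicted log-variances
    lam     : regularization weight (hypothetical value)
    """
    # Standard Gaussian negative log-likelihood (up to a constant)
    nll = 0.5 * (log_var + (x - mu) ** 2 / np.exp(log_var))
    # Assumed penalty: discourage broad predictive distributions
    penalty = lam * np.exp(log_var)
    return np.mean(nll + penalty)
```

Under plain maximum likelihood, a few outlier samples can be absorbed cheaply by predicting a large variance everywhere; the added penalty makes that escape route costly, so the model stays sharp on clean speech.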
The physical world consists of spatially varying media, such as the atmosphere and the ocean, in which light and sound propagate along non-linear trajectories. This presents a challenge to existing ray-tracing based methods, which are widely adopted to simulate propagation due to their efficiency and flexibility, but which assume linear rays. We present a novel algorithm that traces analytic ray curves computed from local media gradients, and exploits closed-form solutions for both the intersection of a ray curve with a planar surface and the travel distance along the curve. By constructing an adaptive unstructured mesh, our algorithm is able to model general media profiles that vary in three dimensions, with complex boundaries consisting of terrain and other scene objects such as buildings. Our analytic ray curve tracer with the adaptive mesh improves efficiency considerably over prior methods. We highlight the algorithm's application to simulating visual and sound propagation in outdoor scenes.
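To illustrate the key ingredient, the sketch below traces one analytic ray-curve step in two dimensions, assuming a cell with a linear sound-speed profile c(z) = c0 + g*z, in which Snell's law (cos(theta)/c constant) bends the ray into a circular arc with a closed-form plane intersection. The function name and the first-crossing selection heuristic are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def trace_arc_to_plane(x, z, theta, c0, g, z_plane):
    """Illustrative 2-D analytic ray-curve step. In a cell with linear
    speed c(z) = c0 + g*z (g != 0 assumed; g == 0 gives a straight ray),
    the ray is a circular arc, so its crossing of the plane z = z_plane
    is solved in closed form instead of by many small march steps.
    theta is the grazing angle in radians, measured from horizontal."""
    c = c0 + g * z
    p = np.cos(theta) / c              # ray parameter, invariant on the ray
    R = 1.0 / (p * g)                  # signed radius of the circular arc
    zc = -c0 / g                       # arc center lies where c(z) = 0
    xc = x + R * np.sin(theta)
    disc = R**2 - (z_plane - zc)**2    # circle-plane intersection
    if disc < 0:
        return None                    # ray turns before reaching the plane
    roots = xc + np.array([1.0, -1.0]) * np.sqrt(disc)
    ahead = roots[(roots - x) * np.cos(theta) > 0]
    if ahead.size == 0:
        return None
    x_hit = ahead[np.abs(ahead - x).argmin()]     # first crossing ahead
    cos_hit = np.clip(p * (c0 + g * z_plane), -1.0, 1.0)
    return x_hit, np.arccos(cos_hit)   # hit abscissa and new grazing angle
```

Because each step covers an entire cell of the adaptive mesh in one closed-form evaluation, the cost scales with the number of cells a curve crosses rather than with a small, fixed march step, which is where the claimed efficiency gain over linear-segment tracers comes from.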
We present a new interaction handling model for physics-based sound synthesis in virtual environments. We propose a three-level surface representation describing object shape, visible surface bumpiness, and microscopic roughness (e.g., friction) to model surface contacts at varying resolutions and automatically simulate rich, complex contact sounds. This model can capture various types of surface interaction, including sliding, rolling, and impact, by combining the three levels of spatial resolution. We demonstrate our method by synthesizing complex, varying sounds in several interactive scenarios and a game-like virtual environment. The three-level interaction model for sound synthesis enhances the perceived coherence between audio and visual cues in virtual reality applications.
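A minimal sketch of how the three levels could combine into a single excitation signal is given below; the decomposition into an impact impulse, a bump-driven rolling component, and speed-scaled friction noise is an assumed illustration, not the paper's exact model, and all names and constants are hypothetical. The resulting excitation would then drive a modal resonator bank such as the one sketched under the modal synthesis abstract above.

```python
import numpy as np

rng = np.random.default_rng(0)

def contact_excitation(v_rel, normal_force, bump_profile, fs=44100, dur=0.5):
    """Assumed three-level excitation sketch.
    Level 1 (shape): a single impact impulse at contact onset.
    Level 2 (visible bumps): the contact point sweeps a stored 1-D bump
    profile at the sliding/rolling speed, modulating the excitation.
    Level 3 (microscopic roughness): band-unshaped noise scaled by
    sliding speed and load stands in for friction."""
    n = int(dur * fs)
    t = np.arange(n) / fs
    impact = np.zeros(n)
    impact[0] = normal_force                              # level 1: impulse
    idx = (v_rel * t * len(bump_profile)) % len(bump_profile)
    rolling = normal_force * np.interp(
        idx, np.arange(len(bump_profile)), bump_profile)  # level 2: bumps
    friction = 0.1 * v_rel * normal_force * rng.standard_normal(n)  # level 3
    return impact + rolling + friction
```

Separating the levels this way lets one representation produce qualitatively different sounds: zeroing v_rel leaves a pure impact, while a sustained v_rel with small normal force yields mostly the sliding/friction component.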