Proceedings of the 30th ACM International Conference on Multimedia 2022
DOI: 10.1145/3503161.3548253

MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes

Cited by 14 publications (5 citation statements)
References 24 publications
“…In ref. 25, a 3D scene mesh is encoded to a latent space and combined with source/receiver pairs to produce an embedding that is passed through a generative adversarial network for IR generation; frameworks predicting binaural IRs inputting source/receiver pairs, head orientation, and time are presented in refs. 26 and 27, where the former is taking into account a local 2D feature grid representing a static scene, whereas the latter is motivated by acoustic radiance transfer considering surface points for static scenes; in ref.…”
Section: Significance (mentioning)
confidence: 99%
“…Because the environment, emitter and receiver are static, so are the acoustics. Other work predicts impulse responses in simulation either for a single environment [28], or by using few-shot egocentric observations [30], or by using the 3D scene mesh [44]. While simulated results are satisfying, those models' impact on real-world data is unknown, especially for scenarios where human speakers move and interact with each other.…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years, generative models, particularly generative adversarial networks (GANs) [21], have gained significant attention and success in various domains, including audio super-resolution and RIR synthesis [22][23][24][25]. The GAN framework consists of a generator and a discriminator.…”
Section: Introduction (mentioning)
confidence: 99%