2019
DOI: 10.3813/aaa.919349

The Ambisonic Recordings of Typical Environments (ARTE) Database

Abstract: Everyday listening environments are characterized by far more complex spatial, spectral and temporal sound field distributions than the acoustic stimuli that are typically employed in controlled laboratory settings. As such, the reproduction of acoustic listening environments has become important for several research avenues related to sound perception, such as hearing loss rehabilitation, soundscapes, speech communication, auditory scene analysis, automatic scene classification, and room acoustics. However, …

Cited by 36 publications (31 citation statements)
“…The ECO-SiN corpus comprises 192 naturally spoken sentences, in which four lists of 16 sentences were spoken with three different vocal efforts. The average sentence length is 6.3 words, and an example sentence is “That discovery was like really interesting for me.” In brief, the sentences were extracted from two people engaging in unscripted conversation while they listened to three different realistic background noises from the ARTE database (Weisser et al, 2019b); a church, an indoor café, a busy food court (see Table 2) via highly open headphones. The background noises were selected based on the conversational speech levels determined by Weisser and Buchholz (2019).…”
Section: Methods (mentioning)
confidence: 99%
“…The background noises were drawn from the ARTE database (Weisser et al, 2019b), which were recorded with a 62-channel hard-sphere microphone array and encoded into the higher-order Ambisonics (HOA) format. They were then decoded here for simulated playback with the spherical 41-channel loudspeaker array inside the anechoic chamber of the Australian Hearing Hub, Macquarie University.…”
Section: Methods (mentioning)
confidence: 99%
“…For the perturbation set, we consider all degradations commonly found in various audio processing tasks including additive noise, speech distortions (e.g. clipping and frequency masking, frequency resampling, pitch shifting), compression (e.g., mu-law and MP3), and recorded binaural sounds [30,31]. We also use the data collected using the binaural multichannel Wiener filter (MWF) [32] algorithm, and find that adding datasets and perturbations with subtle differences increases the robustness of our model to small differences.…”
Section: Datasets and Training (mentioning)
confidence: 99%