During the development of audio coding schemes, a number of methods for evaluating perceived audio quality have been developed. To enable comparisons across test sites, several methods have been standardized. In standardized methods like ITU-R Recommendations BS.1116 and BS.1534 (MUSHRA), the output of a codec (signal under test) is compared to an open reference. This reference is the unimpaired input of the codec. Assuming that the codec is “transparent,” the signal under test should sound exactly like this reference. For object-based audio, the input of a codec is a combination of raw audio channels and metadata describing the position and other properties of the audio objects. It does not make sense to listen directly to the raw data. For listening, it is necessary to calculate the driving signal for each loudspeaker available in the listening room (rendering). Therefore, the comparison of different renderers is difficult: the renderer used to generate the reference signal has an advantage. Using a dedicated loudspeaker as the reference does not solve the problem either: loudspeakers always sound different from virtual sound objects. The presentation will discuss problems and solutions in more detail. Some promising setups based on multi-attribute testing are presented.
In many urban areas, traffic load and noise pollution are constantly increasing. Automated systems for traffic monitoring are promising countermeasures, which make it possible to systematically quantify and predict local traffic flow in order to support municipal traffic planning decisions. In this paper, we present a novel open benchmark dataset containing 2.5 hours of stereo audio recordings of 4718 vehicle passing events captured with both high-quality sE8 and medium-quality MEMS microphones. This dataset is well suited to evaluating the use case of deploying audio classification algorithms to embedded sensor devices with restricted microphone quality and hardware processing power. In addition, this paper provides a detailed review of recent acoustic traffic monitoring (ATM) algorithms as well as the results of two benchmark experiments on vehicle type classification and direction of movement estimation using four state-of-the-art convolutional neural network architectures.