2013
DOI: 10.1007/s11042-013-1472-2

Object-based audio for interactive football broadcast

Abstract: An end-to-end AV broadcast system providing an immersive, interactive experience for live events is the development aim for the EU FP7 funded project, FascinatE. The project has developed real-time audio object event detection and localisation, scene modelling and processing methods for multimedia data including 3D audio, which will allow users to navigate the event by creating their own unique user-defined scene. As part of the first implementation of the system, a test shoot was carried out capturing a live P…

Citations: cited by 17 publications (10 citation statements)
References: 20 publications
“…Here, Θ denotes the DNN parameters. Its example on a fully connected DNN is described later (after (16)). The target and observation source are assumed to be vectorized for all frequency bins as…”
Section: B Maximum-likelihood-based DNN Training For T-F Mask Estima… (mentioning)
confidence: 99%
“…First, the i-th observation utterance $X^{(i)}$ is simulated by (1) using a randomly selected target-source file and a noise source with equal frame size from the training dataset. Next, the T-F mask $G(x^{(i)}_\tau)$ and variance $\sigma(x^{(i)}_\tau)$ are estimated by (11)-(16). Then, to simulate the k-th output signal $\hat{S}^{(i,k)}$, the temporary output signal $S^{(i,k)}_{\omega,\tau}$ is sampled from the complex Gaussian distribution using a pseudo random number generator, such as the Mersenne-Twister [44], as…”
Section: Training Procedures (mentioning)
confidence: 99%
“…However, in future content production, it might be possible to capture media assets directly into an object-based form. This approach has been applied for experimental live sports broadcasting [20], [21], but is not yet commonplace for audio production. The proposed system offers new opportunities for object-based audio capture based on performer tracking (to obtain metadata) and the application of BSS and beamforming techniques to spatial audio capture (to acquire separated object audio).…”
Section: B Component-based Design (mentioning)
confidence: 99%
“…In current TV broadcasts these discrete audio object categories are not available at any point in the broadcast production chain and therefore the end user has no control over the relative levels of these sounds. In order to provide these three sound sources as independent and controllable entities some considerable development had to take place in the acquisition and production techniques used to capture a complex sound scene such as that found at a live sports event [64]. Currently the key objectives for audio in football coverage are twofold: picking up sounds on the pitch as clearly as possible during the game and utilizing the 5.1 surround sound capability to give the viewer a sense of immersion and of "being there."…”
Section: Clean Audio For Live Sports Coverage (mentioning)
confidence: 99%