The media industry is currently being pulled in the often-opposing directions of increased realism (high resolution, stereoscopy, large screens) and personalization (selection and control of content, availability on many devices). We investigate the feasibility of an end-to-end format-agnostic approach to supporting both trends. In this paper, different aspects of a format-agnostic capture, production, delivery and rendering system are discussed. At the capture stage, the concept of a layered scene representation is introduced, including panoramic video and 3D audio capture. At the analysis stage, a virtual director component is discussed that allows for the automatic execution of cinematographic principles, using feature tracking and saliency detection. At the delivery stage, resolution-independent audiovisual transport mechanisms for both managed and unmanaged networks are treated. At the rendering stage, a rendering process is introduced that manipulates the audiovisual content to match the properties of the connected displays and loudspeakers. Different parts of the complete system are revisited to demonstrate the requirements and the potential of this advanced concept.
In everyday life, speech is often accompanied by a situation-specific acoustic cue: a hungry bark as you ask 'Has anyone fed the dog?'. This paper investigates the effect such cues have on speech intelligibility in noise and evaluates their interaction with the established effect of situation-specific semantic cues. This work is motivated by the introduction of new object-based broadcast formats, which have the potential to optimise intelligibility by controlling the level of individual broadcast audio elements at the point of service. The results of this study show that situation-specific acoustic cues alone can improve word recognition in multi-talker babble by 69.5%, a similar amount to semantic cues. The combination of semantic and acoustic cues provides a further improvement of 106.0% compared with no cues, and 18.7% compared with semantic cues only. Interestingly, whilst increasing the subjective intelligibility of the target word, the presence of acoustic cues degraded the objective intelligibility of the speech-based semantic cues by 47.0% (equivalent to reducing the speech level by 4.5 dB). This paper discusses the interactions between the two types of cue and the implications that these results have for assessing and improving the intelligibility of broadcast speech.
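As a hedged illustration of the point-of-service level control that object-based formats enable (not code from the study), the sketch below mixes hypothetical audio objects while applying a dB gain to the speech object before summation; the 4.5 dB value is reused from the abstract purely as an example setting, and the object names are invented.

    import numpy as np

    def db_to_linear(db):
        """Convert a gain in dB to a linear amplitude factor."""
        return 10.0 ** (db / 20.0)

    def render_mix(objects, speech_boost_db=4.5):
        """Sum audio objects, boosting any object whose name marks it as speech."""
        gain = db_to_linear(speech_boost_db)
        mix = np.zeros_like(next(iter(objects.values())))
        for name, signal in objects.items():
            mix += signal * (gain if name.startswith("speech") else 1.0)
        return mix

    # Hypothetical 1-second objects at 48 kHz.
    t = np.linspace(0, 1, 48000, endpoint=False)
    objects = {
        "speech_dialogue": 0.1 * np.sin(2 * np.pi * 220 * t),
        "background_babble": 0.05 * np.random.randn(t.size),
    }
    mix = render_mix(objects, speech_boost_db=4.5)

Because the objects stay separate until this final render, the same gain decision could instead be driven per listener or per device, which is what makes intelligibility optimisation at the point of service possible at all.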