Yan‐Chen Lu scite author profile

This paper describes the acquisition and content of a new multi-modal database. Some tools for making use of the data streams are also presented. The Computational AudioVisual Analysis (CAVA) database is a unique collection of three synchronised data streams obtained from a binaural microphone pair, a stereoscopic camera pair and a head tracking device. All recordings are made from the perspective of a person, i.e. what a human with natural head movements would see and hear in a given environment. The database is intended to facilitate research into humans' ability to optimise their multi-modal sensory input and fills a gap by providing data that enables human-centred audiovisual scene analysis. It also enables 3D localisation using audio, visual, or audiovisual cues. A total of 50 sessions, with varying degrees of visual and auditory complexity, were recorded. These range from seeing and hearing a single speaker moving in and out of the field of view, to moving around a 'cocktail party' style situation, mingling and joining different small groups of people chatting.


Hardware-Accelerated Vehicle License Plate Detection at High-Definition Image

Yang, Lu, Chen et al. 2011


Active binaural distance estimation for dynamic sources

Lu, Cooke, Christensen 2007



Yan‐Chen Lu

Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners

Real time color based particle filtering for object tracking with dual cache architecture

The CAVA corpus

