Abstract. In this paper we describe the acquisition and content of a new, large, realistic and challenging multi-modal database intended for training and testing multi-modal verification systems. The BANCA database was captured in four European languages in two modalities (face and voice). Both high- and low-quality microphones and cameras were used for recording. The subjects were recorded in three different scenarios (controlled, degraded and adverse) over a period of three months. In total, 208 people were captured, half men and half women. We also describe a protocol for evaluating verification algorithms on the database. The database will be made available to the research community through http://www.ee.surrey.ac.uk/Research/VSSP/banca.
Objective: Video and sound acquisition and processing technologies have seen great improvements in recent decades, with many applications in the biomedical area. The aim of this paper is to review the state of the art in these topics in paediatrics and to evaluate their potential application for monitoring in the neonatal intensive care unit (NICU). Approach: For this purpose, more than 150 papers dealing with video and audio processing were reviewed. For both topics, clinical applications are described according to the cohorts considered: full-term newborns, infants and toddlers, or preterm newborns. Processing methods are then presented in terms of data acquisition, feature extraction and characterization. Main results: The paper first focuses on the exploitation of video recordings; these began to be automatically processed in the 2000s, and we show that they have mainly been used to characterize infant motion. Other applications, including respiration and heart rate estimation and facial analysis, are also presented. Audio processing is then reviewed, with a focus on the analysis of crying. The first studies in this field focused on induced-pain cries, and the newest ones deal with spontaneous cries; the analyses are mainly based on frequency features. Some papers dealing with non-cry signals are also discussed. Significance: Finally, we show that even if recent improvements in digital video and signal processing allow for increased automation of processing, the context of the NICU makes a fully automated analysis of long recordings problematic. A few proposals for overcoming some of the limitations are given.