Objective: Video and sound acquisition and processing technologies have improved greatly in recent decades, with many applications in the biomedical area. The aim of this paper is to review the state of the art in these topics in paediatrics and to evaluate their potential application for monitoring in the neonatal intensive care unit (NICU). Approach: For this purpose, more than 150 papers dealing with video and audio processing were reviewed. For both topics, clinical applications are described according to the cohorts considered: full-term newborns, infants and toddlers, or preterm newborns. Processing methods are then presented in terms of data acquisition, feature extraction and characterization. Main results: The paper first focuses on the exploitation of video recordings; these began to be automatically processed in the 2000s, and we show that they have mainly been used to characterize infant motion. Other applications, including respiration and heart rate estimation and facial analysis, are also presented. Audio processing is then reviewed, with a focus on the analysis of crying. The first studies in this field focused on induced-pain cries, whereas the newest ones deal with spontaneous cries; the analyses are mainly based on frequency features. Some papers dealing with non-cry signals are also discussed. Significance: Finally, we show that although recent improvements in digital video and signal processing allow for increased automation of processing, the context of the NICU makes a fully automated analysis of long recordings problematic. A few proposals for overcoming some of the limitations are given.