The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis, as well as a "real-time" experiment simulating a situation in which a tracker processes images as if provided by a continuously running sensor. A long-term tracking sub-challenge has been introduced to the set of standard VOT sub-challenges. The new sub-challenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled, and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking sub-challenges. Performance of the tested trackers typically far exceeds that of standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit, and the results are publicly available at the challenge website.
In search and rescue operations, it is crucial to rapidly distinguish people who are alive from those who are not. With this information, emergency teams can prioritize their operations to save more lives. However, in some natural disasters people may be lying on the ground covered with dust, debris, or ash, making them difficult to detect by video analysis tuned to human shapes. We present a novel method to estimate the locations of people from aerial video using image and signal processing designed to detect breathing movements. We show that this method can successfully detect both clearly visible people and people who are fully occluded by debris. First, the aerial videos were stabilized using keypoints of adjacent image frames. Next, the stabilized video was decomposed into tile videos, and the temporal frequency bands of interest were motion-magnified while the other frequencies were suppressed. Image differencing and temporal filtering were performed on each tile video to detect potential breathing signals. Finally, the detected frequencies were remapped to the image frame, creating a life-signs map that indicates possible human locations. The proposed method was validated with both aerial and ground-recorded videos in a controlled environment. On this dataset, the method showed good reliability for aerial videos and no errors for ground-recorded videos, with average precision of 0.913 for aerial videos and 1.0 for ground-recorded videos.
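A minimal sketch of the per-tile temporal analysis stage described above, assuming a stabilized grayscale video already loaded as a NumPy array; the tile size, the roughly 0.2-0.5 Hz breathing band, and the peak-dominance threshold are illustrative assumptions, and the stabilization and motion-magnification stages are omitted.

```python
# Sketch: per-tile temporal filtering to flag breathing-like motion.
# `frames` is a stabilized grayscale video of shape (T, H, W) sampled at `fps`.
# Tile size, band limits, and threshold are assumptions, not the paper's exact values.
import numpy as np
from scipy.signal import butter, filtfilt

def detect_breathing_tiles(frames, fps, tile=32, band=(0.2, 0.5), thresh=3.0):
    T, H, W = frames.shape
    b, a = butter(3, [band[0] / (fps / 2), band[1] / (fps / 2)], btype="band")
    life_map = np.zeros((H // tile, W // tile))
    for i in range(H // tile):
        for j in range(W // tile):
            patch = frames[:, i * tile:(i + 1) * tile, j * tile:(j + 1) * tile]
            # Frame differencing collapses each tile into a 1-D motion signal.
            signal = np.abs(np.diff(patch.astype(float), axis=0)).mean(axis=(1, 2))
            filtered = filtfilt(b, a, signal)
            spectrum = np.abs(np.fft.rfft(filtered))
            freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
            in_band = (freqs >= band[0]) & (freqs <= band[1])
            if not in_band.any():
                continue
            # Keep the tile only if the in-band peak clearly dominates the spectrum.
            if spectrum[in_band].max() > thresh * (spectrum.mean() + 1e-9):
                life_map[i, j] = freqs[in_band][spectrum[in_band].argmax()]
    return life_map  # non-zero entries mark candidate breathing locations (Hz)
```

Remapping the non-zero tiles back onto the image grid would then give a coarse life-signs map of the kind the abstract describes.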
Background: Remote physiological measurement can be very useful for biomedical diagnostics and monitoring. This study presents an efficient method for remotely measuring heart rate and respiratory rate from video captured by a hovering unmanned aerial vehicle (UAV). The proposed method estimates heart rate and respiratory rate from video-photoplethysmography (PPG) signals that are synchronous with cardiorespiratory activity. Methods: Since the PPG signal is strongly affected by noise (illumination variations, subject motion, and camera movement), we used advanced signal processing techniques, including complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and canonical correlation analysis (CCA), to remove this noise. Results: To evaluate the performance and effectiveness of the proposed method, a set of experiments was performed on 15 healthy volunteers in a front-facing position, involving motion from both the subject and the UAV under different scenarios and lighting conditions. Conclusion: The experimental results demonstrated that the proposed system, with and without the magnification process, achieves robust and accurate readings that correlate significantly with a standard pulse oximeter and a piezo respiratory belt. Moreover, the squared correlation coefficient, root mean square error, and mean error rate yielded by the proposed method, with and without the magnification process, were significantly better than those of state-of-the-art methods, including independent component analysis (ICA) and principal component analysis (PCA).
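A simplified sketch, under stated assumptions, of the final rate-estimation step: it takes an already denoised video-PPG trace (the CEEMDAN/CCA denoising described above is omitted) and reads heart rate and respiratory rate from the dominant spectral peaks; the band limits and the use of Welch's method are illustrative choices, not the paper's exact procedure.

```python
# Sketch: estimating heart rate and respiratory rate from a denoised
# video-PPG trace. The CEEMDAN/CCA denoising stage from the abstract is
# assumed to have been applied already; band limits are illustrative.
import numpy as np
from scipy.signal import welch

def dominant_rate_bpm(ppg, fps, band):
    """Return the dominant frequency within `band` (Hz), expressed per minute."""
    freqs, psd = welch(ppg, fs=fps, nperseg=min(len(ppg), 512))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return 60.0 * freqs[mask][np.argmax(psd[mask])]

# Typical physiological bands: ~0.75-3 Hz for cardiac, ~0.1-0.5 Hz for respiration.
# heart_rate_bpm = dominant_rate_bpm(ppg_trace, fps=30, band=(0.75, 3.0))
# resp_rate_bpm  = dominant_rate_bpm(ppg_trace, fps=30, band=(0.1, 0.5))
```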
In the aftermath of a disaster, such as an earthquake, flood, or avalanche, ground search for survivors is usually hampered by unstable surfaces and difficult terrain. Drones now play an important role in these situations, allowing rescuers to locate survivors and allocate resources to saving those who can be helped. The aim of this study was to explore the utility of a drone equipped with a novel computer vision system for human life detection. The proposed system uses image sequences captured by a drone camera to remotely detect the cardiopulmonary motion caused by periodic chest movement of survivors. Results from eight human subjects and one mannequin in different poses show that detecting motion on the body surface of survivors is likely to be useful for identifying life signs without any physical contact. The results presented in this study may lead to a new approach to life detection and remote assessment of survivors' life signs.
Aerial human action recognition is an emerging topic in drone applications. Commercial drone platforms capable of detecting basic human actions such as hand gestures have been developed. However, only a limited number of aerial video datasets are available to support increased research into aerial human action analysis. Most existing datasets are confined to indoor scenes or object tracking, and many outdoor datasets lack sufficient human-body detail for state-of-the-art machine learning techniques. To fill this gap and enable research in wider application areas, we present an action recognition dataset recorded in an outdoor setting. A free-flying drone was used to record 13 dynamic human actions. The dataset contains 240 high-definition video clips comprising 66,919 frames. All videos were recorded at low altitude and low speed to capture maximum human pose detail at relatively high resolution. This dataset should be useful to many research areas, including action recognition, surveillance, situational awareness, and gait analysis. To provide baselines, we evaluated the dataset with a pose-based convolutional neural network (P-CNN) and high-level pose feature (HLPF) descriptors. The overall baseline action recognition accuracy obtained with P-CNN was 75.92%.