We present a gesture recognition system that tracks hand movements in near real time. The system uses an infrared time-of-flight range camera with a frame rate of up to 30 Hz to measure 3D surface points on the user's hand. After depth keying and suppression of camera noise by median filtering, the measured data is transformed into a 3D point cloud. Principal component analysis (PCA) is used to obtain a first crude estimate of the location and orientation of the hand. An articulated hand model is then fitted to the data to refine this estimate. The unoptimized system estimates the first seven degrees of freedom of the hand within 200 ms. The reconstructed hand is visualized in AVANGO/Performer and can be used to implement a natural man-machine interface. The work reviews relevant publications, underlines the advantages and shortcomings of the approach, and provides an outlook on future improvements.
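As a minimal sketch of the PCA step described above: given the segmented hand as an (N, 3) point cloud, the centroid yields a location estimate and the eigenvectors of the covariance matrix yield a crude orientation. The array shapes, the synthetic test cloud, and the function name `crude_hand_pose` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def crude_hand_pose(points):
    """PCA-based first estimate: centroid = location, principal axes = orientation."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / len(points)   # 3x3 covariance of the cloud
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # principal axis first
    return centroid, eigvecs[:, order]

# Synthetic elongated cloud standing in for a depth-keyed, median-filtered hand.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3)) * [50.0, 15.0, 5.0] + [100.0, 200.0, 800.0]
location, axes = crude_hand_pose(cloud)
print(location)    # ~ [100, 200, 800]: crude hand position
print(axes[:, 0])  # ~ +/- x axis: the dominant (longest) direction of the cloud
```

Such a first estimate is cheap to compute per frame, which is what makes it a reasonable initialization before the more expensive articulated-model fit.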
In this paper, we describe the techniques we used for BBC rush summarization in TRECVID'08. First, rush videos are hierarchically modeled using a formal language description. Second, shot detection and V-unit determination are applied for video structuring, and junk frames within the model are effectively removed. Third, adaptive clustering is employed to group shots into clusters and remove retakes. Finally, each selected shot is ranked according to its length and the sum of its activity levels for summarization. Competitive results demonstrate the effectiveness and efficiency of our techniques, which are fully implemented in the compressed domain.
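The abstract says shots are ranked by length and summed activity but does not give the scoring formula; the following sketch assumes equal weighting of the two normalized terms, and the `Shot` fields are hypothetical stand-ins for the paper's compressed-domain features.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: int              # first frame index
    end: int                # last frame index (inclusive)
    activity: list[float]   # per-frame activity level (e.g. motion-vector energy)

    @property
    def length(self) -> int:
        return self.end - self.start + 1

def rank_shots(shots: list[Shot]) -> list[Shot]:
    """Rank shots by normalized length plus normalized summed activity, best first."""
    max_len = max(s.length for s in shots)
    max_act = max(sum(s.activity) for s in shots)

    def score(s: Shot) -> float:
        return s.length / max_len + sum(s.activity) / max_act

    return sorted(shots, key=score, reverse=True)
```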
We report on a workflow for the creation of realistic virtual anthropomorphic characters. 3D models of human heads have been reconstructed from real people using a structured-light approach to 3D reconstruction. We describe how these high-resolution models have been simplified and articulated with blend-shape and mesh-skinning techniques to ensure real-time animation. The full-body models have been created manually from photographs. We present a system for capturing whole-body motions, including the fingers, based on an optical motion capture system with 6-DOF rigid bodies and CyberGloves. The motion capture data was processed in a single system, mapped to a virtual character, and visualized in real time. We developed tools and methods for rapid post-processing. To demonstrate the viability of our system, we captured a library of more than 90 gestures.
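The two articulation techniques named above have standard formulations; here is a minimal sketch of both, assuming dense numpy arrays for vertices, blend targets, and bone transforms (the shapes and normalization conventions are assumptions, not the authors' pipeline).

```python
import numpy as np

def blend_shapes(base, targets, weights):
    """Blend-shape evaluation: base mesh plus weighted per-target vertex offsets.

    base:    (V, 3) rest-pose vertices
    targets: (K, V, 3) sculpted expression targets
    weights: (K,) blend weights, typically in [0, 1]
    """
    offsets = targets - base                        # (K, V, 3) per-target deltas
    return base + np.tensordot(weights, offsets, axes=1)

def skin_vertices(rest, bone_matrices, vertex_weights):
    """Linear blend (mesh) skinning: each vertex follows a weighted mix of bones.

    rest:           (V, 3) rest-pose vertices
    bone_matrices:  (B, 4, 4) current bone transforms relative to the rest pose
    vertex_weights: (V, B) per-vertex bone weights, each row summing to 1
    """
    homog = np.concatenate([rest, np.ones((len(rest), 1))], axis=1)    # (V, 4)
    blended = np.einsum('vb,bij->vij', vertex_weights, bone_matrices)  # (V, 4, 4)
    return np.einsum('vij,vj->vi', blended, homog)[:, :3]
```

Blend shapes are the usual choice for facial animation (smooth interpolation between sculpted expressions), while skinning drives the body and fingers from skeletal motion capture data, which matches the split described in the abstract.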
One of the biggest obstacles to constructing effective sociable virtual humans lies in the failure of machines to recognize the desires, feelings, and intentions of the human user. Virtual humans lack the ability to fully understand and decode the communication signals human users emit when communicating with each other. This article describes our research into overcoming this problem by developing senses for virtual humans that enable them to hear and understand human speech, localize the human user in front of the display system, recognize hand postures, and recognize the emotional state of the human user by classifying facial expressions. We report on the methods needed to perform these tasks in real time and conclude with an outlook on promising research issues for the future.
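The article does not name its facial-expression classifier; purely to illustrate the mapping from a facial-feature vector to an emotional state, here is a nearest-centroid sketch. The emotion labels, feature dimensionality, and randomly generated centroids are all placeholder assumptions.

```python
import numpy as np

# Hypothetical per-emotion centroids, one per label, learned offline from
# labeled facial-feature vectors (e.g. distances between tracked landmarks).
EMOTIONS = ["neutral", "happy", "sad", "angry", "surprised"]

def classify_expression(features: np.ndarray, centroids: np.ndarray) -> str:
    """Assign the emotion whose centroid is nearest to the feature vector."""
    dists = np.linalg.norm(centroids - features, axis=1)
    return EMOTIONS[int(np.argmin(dists))]

# Placeholder usage with random "trained" centroids and a random observation.
rng = np.random.default_rng(1)
centroids = rng.normal(size=(len(EMOTIONS), 16))
print(classify_expression(rng.normal(size=16), centroids))
```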