The Supervised Descent Method (SDM) has proven successful in many computer vision applications such as face alignment, tracking, and camera calibration. Recent studies using SDM have achieved state-of-the-art performance on facial landmark localization in depth images [4]. In this study, we propose to use ridge regression instead of least squares regression for learning the SDM, and to change feature sizes in each iteration, effectively turning the landmark search into a coarse-to-fine process. We apply the proposed method to facial landmark localization on the Bosphorus 3D Face Database, using frontal depth images with no occlusion. Experimental results confirm that both ridge regression and adaptive feature sizes improve the localization accuracy considerably.
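As a rough illustration of the two ideas in this abstract, the sketch below trains a small SDM-style cascade in which each stage is fit with closed-form ridge regression and uses a different feature size. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `fit_stage`, `train_cascade`, `extract`, `patch_sizes`, and `lam` are hypothetical, the feature extractor is left as a caller-supplied function, and leaving the bias term unpenalized is a choice made here rather than something the abstract specifies.

```python
import numpy as np


def add_bias(features):
    """Append a constant column so each stage can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_stage(features, targets, lam):
    """Fit one descent map R by minimizing ||targets - X R||^2 + lam * ||W||^2.

    lam = 0 reduces to ordinary least squares; lam > 0 gives ridge regression.
    W is R without its last (bias) row, which is left unpenalized (assumption).
    """
    X = add_bias(features)
    penalty = lam * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # do not shrink the bias row
    # Solve the regularized normal equations (X^T X + penalty) R = X^T targets.
    return np.linalg.lstsq(X.T @ X + penalty, X.T @ targets, rcond=None)[0]


def train_cascade(extract, init_shapes, true_shapes, patch_sizes, lam):
    """Train one regressor per stage, changing the feature size at each stage.

    extract(shapes, size) is a hypothetical feature extractor that returns an
    (n_samples, n_features) array computed around the current landmark estimates.
    Listing patch_sizes from large to small makes the search coarse-to-fine.
    """
    shapes = init_shapes.copy()
    stages = []
    for size in patch_sizes:
        phi = extract(shapes, size)
        R = fit_stage(phi, true_shapes - shapes, lam)
        shapes = shapes + add_bias(phi) @ R  # apply the learned update
        stages.append((size, R))
    return stages, shapes
```

With `lam = 0` each stage is the ordinary least squares fit; increasing `lam` shrinks the learned weights, which is the usual motivation for ridge regression when the features are high-dimensional or strongly correlated.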
Automatic prediction of personalities from meeting videos is a classical machine learning problem. Psychologists define personality traits as uncorrelated long-term characteristics of human beings. However, human annotations of personality traits introduce cultural and cognitive bias. In this study, we present methods to automatically predict emergent leadership and personality traits in the group meeting videos of the Emergent LEAdership corpus. Prediction of extraversion has attracted the attention of psychologists as it is able to explain a wide range of behaviors, predict performance, and assess risk. Prediction of emergent leadership, on the other hand, is of great importance for the business community. Therefore, we focus on the prediction of extraversion and leadership, since these traits are also strongly manifested in a meeting scenario through the extracted features. We use feature analysis and multi-task learning methods in conjunction with the non-verbal features and crowd-sourced annotations from the Video bLOG (VLOG) corpus to perform a multi-domain and multi-task prediction of personality traits. Our results indicate that multi-task learning methods that use 10 personality annotations as tasks, together with transfer from two datasets from different domains, improve the overall recognition performance. Preventing negative transfer by using a forward task selection scheme yields the best recognition results, with 74.5% accuracy for leadership and 81.3% accuracy for extraversion. These results demonstrate the presence of annotation bias as well as the benefit of transferring information from weakly similar domains.
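The abstract does not spell out its forward task selection scheme, so the following is only a generic greedy forward selection, sketched to show how such a scheme can guard against negative transfer. The function `forward_task_selection` and the `evaluate` callback are hypothetical: `evaluate` is assumed to return a validation score for the target trait when it is trained jointly with the given list of auxiliary tasks.

```python
def forward_task_selection(candidates, evaluate):
    """Greedily add auxiliary tasks while the validation score keeps improving.

    candidates: auxiliary task identifiers (hypothetical, e.g. trait names).
    evaluate:   callable mapping a list of selected tasks to a score,
                higher is better (assumed to wrap multi-task training).
    Returns the selected tasks in the order they were added and the final score.
    """
    selected = []
    remaining = list(candidates)
    best_score = evaluate(selected)  # baseline: target task alone

    while remaining:
        # Score every one-task extension of the current selection.
        scored = [(evaluate(selected + [task]), task) for task in remaining]
        top_score, top_task = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining task helps; adding one would not improve the score
        selected.append(top_task)
        remaining.remove(top_task)
        best_score = top_score

    return selected, best_score
```

Because a task is added only when it raises the validation score, tasks that would hurt the target trait are never selected. The cost is one call to `evaluate` per remaining candidate in every round.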
In this paper, we propose a set of features called temporal accumulative features (TAF) for representing and recognizing isolated sign language gestures. By incorporating sign-language-specific constructs to better represent the unique linguistic characteristics of sign language videos, we have devised an efficient and fast sign language recognition (SLR) method for isolated gestures. The proposed method is an HSV-based accumulative video representation in which keyframes based on the linguistic movement-hold model are represented by different colors. We also incorporate hand shape information and, using a small-scale convolutional neural network, demonstrate that sequential modeling of accumulative features for linguistic subunits improves upon baseline classification results.