Abstract: Nonverbal communication plays an important role in many aspects of our lives, such as job interviews, where vis-à-vis conversations take place. This paper proposes a method to automatically detect body communicative cues from video sequences of the upper body of individuals in a conversational context. To our knowledge, our work is novel in explicitly addressing the recognition of visual activity in a seated, conversational setting from monocular video, whereas most existing work in video-based motion capture targets full-body motion with lower-limb activities. We first detect the person's hands in the sequence by searching for the higher-speed parts across the whole video. Then, aided by training on a set of typical conversational movements, we infer the approximate 3D upper body pose, which we map to a low-dimensional space in order to perform action recognition. We test our system in the context of job interviews, with several new databases that we make publicly available.
Hand gestures and body posture are intimately linked to speech as they are used to enrich the vocal content, and are therefore inherently multimodal. As an important part of nonverbal behavior, body communication carries relevant information that can reveal social constructs as diverse as personality, internal states, or job interview outcomes. In this work, we analyze body communication cues in real dyadic employment interviews, where the protagonists of the interaction are seated. We use a mixture of body communicative features based on manual annotations and automated extraction methods to successfully predict two key organizational constructs, namely personality and job interview ratings. Our work also confirms the multimodal nature of body communication and shows that the speaking status can be used to improve the prediction performance of personality and hirability.
Panoptic segmentation has recently unified semantic and instance segmentation, previously addressed separately, thus taking a step further towards creating more comprehensive and efficient perception systems. In this paper, we present Panoster, a novel proposal-free panoptic segmentation method for LiDAR point clouds. Unlike previous approaches relying on several steps to group pixels or points into objects, Panoster proposes a simplified framework incorporating a learning-based clustering solution to identify instances. At inference time, this acts as a class-agnostic segmentation, allowing Panoster to be fast, while outperforming prior methods in terms of accuracy. Without any post-processing, Panoster reached state-of-the-art results among published approaches on the challenging SemanticKITTI benchmark, and further increased its lead by exploiting heuristic techniques. Additionally, we showcase how our method can be flexibly and effectively applied on diverse existing semantic architectures to deliver panoptic predictions.
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings. In perception for autonomous driving, measuring the uncertainty means providing additional calibrated information to downstream tasks, such as path planning, that can use it towards safe navigation. In this work, we propose a novel sampling-free uncertainty estimation method for object detection. We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size. To achieve this, we propose an uncertainty-aware heatmap, and exploit the neighboring bounding boxes provided by the detector at inference time. We evaluate the detection performance and the quality of the different uncertainty estimates separately, also with challenging out-of-domain samples: BDD100K and nuImages with models trained on KITTI. Additionally, we propose a new metric to evaluate location and size uncertainties. When transferring to unseen datasets, CertainNet generalizes substantially better than previous methods and an ensemble, while being real-time and providing high quality and comprehensive uncertainty estimates.
Panoptic segmentation has recently unified semantic and instance segmentation, previously addressed separately, thus taking a step further towards creating more comprehensive and efficient perception systems. In this paper, we present Panoster, a novel proposal-free panoptic segmentation method for point clouds. Unlike previous approaches relying on several steps to group pixels or points into objects, Panoster proposes a simplified framework incorporating a learning-based clustering solution to identify instances. At inference time, this acts as a class-agnostic semantic segmentation, allowing Panoster to be fast, while outperforming prior methods in terms of accuracy. Additionally, we showcase how our approach can be flexibly and effectively applied on diverse existing semantic architectures to deliver panoptic predictions.
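One way to read the claim that instance clustering "acts as a class-agnostic semantic segmentation" at inference time is that each point receives a score per instance slot, and the instance ID is simply the highest-scoring slot. The sketch below illustrates that reading only; it is not Panoster's implementation. The function name `panoptic_labels`, the fixed number of instance slots, the `thing_classes` set, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Set, Tuple


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def panoptic_labels(
    semantic_scores: Sequence[Sequence[float]],
    instance_scores: Sequence[Sequence[float]],
    thing_classes: Set[int],
) -> List[Tuple[int, int]]:
    """Combine per-point semantic and instance scores into panoptic labels.

    Hypothetical helper: each point gets (class_id, instance_id).
    Points of "stuff" classes get instance_id 0; points of "thing"
    classes get 1 + the index of their best instance slot.
    """
    labels = []
    for sem, inst in zip(semantic_scores, instance_scores):
        class_id = argmax(sem)
        # Instance slots are class-agnostic: a plain argmax, no grouping step.
        instance_id = argmax(inst) + 1 if class_id in thing_classes else 0
        labels.append((class_id, instance_id))
    return labels


# Toy example: 4 points, classes 0 = road (stuff) and 1 = car (thing),
# and 2 instance slots. All numbers are made up.
semantic_scores = [[2.0, 0.1], [0.2, 1.5], [0.1, 3.0], [0.3, 2.2]]
instance_scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
labels = panoptic_labels(semantic_scores, instance_scores, thing_classes={1})
print(labels)
```

In this toy setup the road point keeps instance ID 0, while the three car points are split across two instance slots without any separate grouping or post-processing pass, which is the property the abstract attributes to the approach.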