Abstract: Robust object recognition is a crucial ingredient of many, if not all, real-world robotics applications. This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition. Our architecture is composed of two separate CNN processing streams, one for each modality, which are consecutively combined with a late fusion network. We focus on learning with imperfect sensor data, a typical problem in real-world robotics tasks. For accurate learning, we introduce a multi-stage training methodology and two crucial ingredients for handling depth data with CNNs. The first is an effective encoding of depth information for CNNs that enables learning without the need for large depth datasets. The second is a data augmentation scheme for robust learning with depth images by corrupting them with realistic noise patterns. We present state-of-the-art results on the RGB-D object dataset [15] and show recognition in challenging RGB-D real-world noisy settings.
Abstract: Developing the perfect SLAM front-end that produces graphs which are free of outliers is generally impossible due to perceptual aliasing. Therefore, optimization back-ends need to be able to deal with outliers resulting from an imperfect front-end. In this paper, we introduce dynamic covariance scaling, a novel approach for effective optimization of constraint networks in the presence of outliers. The key idea is to use a robust function that generalizes classical gating and dynamically rejects outliers without compromising convergence speed. We implemented and thoroughly evaluated our method on publicly available datasets. Compared to recently published state-of-the-art methods, we obtain a substantial speed-up without increasing the number of variables in the optimization process. Our method can be easily integrated in almost any SLAM back-end.
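The abstract does not state the robust function itself. As a minimal sketch, assuming the scaling form commonly associated with dynamic covariance scaling, s = min(1, 2Φ / (Φ + χ²)), the idea of a "soft gate" can be illustrated as follows. The function names `dcs_scale` and `scaled_chi2`, the default Φ = 1, and the choice to weight the squared error by s² are assumptions made for illustration and are not taken from the text above.

```python
def dcs_scale(chi2: float, phi: float = 1.0) -> float:
    """Assumed scaling factor s = min(1, 2*phi / (phi + chi2)).

    chi2 is the squared (Mahalanobis) error of one constraint and phi is
    a free tuning parameter. Both the formula and the default value are
    assumptions of this sketch.
    """
    if chi2 < 0.0 or phi <= 0.0:
        raise ValueError("chi2 must be >= 0 and phi must be > 0")
    return min(1.0, 2.0 * phi / (phi + chi2))


def scaled_chi2(chi2: float, phi: float = 1.0) -> float:
    """Effective squared error after scaling the information matrix by s**2."""
    s = dcs_scale(chi2, phi)
    return s * s * chi2


if __name__ == "__main__":
    # Constraints with chi2 <= phi keep full weight (s = 1), while
    # large-error constraints are progressively down-weighted.
    for chi2 in (0.5, 1.0, 9.0, 100.0):
        print(chi2, dcs_scale(chi2), scaled_chi2(chi2))
```

Under these assumptions, a constraint whose error is at or below Φ is left untouched, which resembles classical gating, while a constraint with a large error is down-weighted smoothly instead of being cut off. No extra variables are introduced, because the scale is computed directly from the current error.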
Abstract: People detection is a key issue for robots and intelligent systems sharing a space with people. Previous works have used cameras and 2D or 3D range finders for this task. In this paper, we present a novel people detection approach for RGB-D data. We take inspiration from the Histogram of Oriented Gradients (HOG) detector to design a robust method to detect people in dense depth data, called Histogram of Oriented Depths (HOD). HOD locally encodes the direction of depth changes and relies on a depth-informed scale-space search that leads to a 3-fold acceleration of the detection process. We then propose Combo-HOD, an RGB-D detector that probabilistically combines HOD and HOG. The experiments include a comprehensive comparison with several alternative detection approaches including visual HOG, several variants of HOD, a geometric person detector for 3D point clouds, and a Haar-based AdaBoost detector. With an equal error rate of 85% in a range up to 8 m, the results demonstrate the robustness of HOD and Combo-HOD on a real-world data set collected with a Kinect sensor in a populated indoor environment.
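To make "locally encodes the direction of depth changes" concrete, the following is a minimal HOG-style sketch over a single cell of a depth image. It is not the HOD descriptor as published: the function name `depth_orientation_histogram`, the 9 unsigned orientation bins, the central-difference gradients, and the magnitude weighting are all assumptions chosen for illustration, and any depth preprocessing or block normalization is left out.

```python
import math
from typing import List


def depth_orientation_histogram(depth: List[List[float]], bins: int = 9) -> List[float]:
    """Histogram of depth-gradient orientations for one cell.

    depth is a 2D grid of depth values (rows of equal length).
    Orientations are unsigned (folded into [0, pi)) and each interior
    pixel votes with its gradient magnitude. All of these choices are
    assumptions of this sketch.
    """
    rows, cols = len(depth), len(depth[0])
    hist = [0.0] * bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the local depth change.
            gx = depth[y][x + 1] - depth[y][x - 1]
            gy = depth[y + 1][x] - depth[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat region: no direction to encode
            angle = math.atan2(gy, gx) % math.pi
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += mag
    return hist


if __name__ == "__main__":
    # Depth increasing from left to right: all votes fall into the first bin.
    ramp_x = [[float(x) for x in range(4)] for _ in range(4)]
    print(depth_orientation_histogram(ramp_x))
```

A full detector would compute such histograms over a grid of cells inside a sliding window and pass the concatenated vector to a classifier. Those stages are omitted here because the abstract does not describe them.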
Abstract: The ability to act in a socially-aware way is a key skill for robots that share a space with humans. In this paper we address the problem of socially-aware navigation among people that meets objective criteria such as travel time or path length as well as subjective criteria such as social comfort. In contrast to the model-based approaches typically taken in related work, we pose the problem as an unsupervised learning problem. We learn a set of dynamic motion prototypes from observations of relative motion behavior of humans found in publicly available surveillance data sets. The learned motion prototypes are then used to compute dynamic cost maps for path planning using an any-angle A* algorithm. In the evaluation we demonstrate that the learned behaviors are better at reproducing human relative motion in both criteria than a Proxemics-based baseline method.
Human activity recognition is a key component for socially enabled robots to effectively and naturally interact with humans. In this paper we exploit the fact that many human activities produce characteristic sounds from which a robot can infer the corresponding actions. We propose a novel recognition approach called Non-Markovian Ensemble Voting (NEV) able to classify multiple human activities in an online fashion without the need for silence detection or audio stream segmentation. Moreover, the method can deal with activities that are extended over undefined periods in time. In a series of experiments in real reverberant environments, we are able to robustly recognize 22 different sounds that correspond to a number of human activities in a bathroom and kitchen context. Our method outperforms several established classification techniques.