Abstract-We present a system for 3D hand gesture recognition based on low-cost time-of-flight (ToF) sensors intended for outdoor use in automotive human-machine interaction. As signal quality is impaired compared to Kinect-type sensors, we study several ways to improve performance when a large number of gesture classes is involved. Our system fuses data from two ToF sensors; the fused data is used to build a large database and subsequently to train a multilayer perceptron (MLP). We demonstrate that we are able to reliably classify a set of ten hand gestures in real time, and we describe the setup of the system, the methods utilised, and possible application scenarios.
I. INTRODUCTION
As "intelligent" devices enter more and more areas of everyday life, the issue of man-machine interaction becomes ever more important. Since interaction should be easy and natural for the user and should not impose a high cognitive load, non-verbal means of interaction such as hand gestures will play a decisive role in this field of research. With the advent of low-cost Kinect-type 3D sensors, and more recently of low-cost ToF sensors (400-500 €) that can be applied in outdoor scenarios, the use of point clouds seems a very logical choice. This poses challenges for machine learning approaches, as the data dimensionality, the sensor noise, and the number of gesture categories of interest are all high. In this article we build upon earlier results [1] and demonstrate how a system can be developed and integrated into a car in order to classify a gesture alphabet of ten hand poses. Our approach is purely data-driven: by extending a point cloud descriptor and adapting it to our needs, we are able to set up a real-time capable system that is robust to daylight interference, invariant to rotation and translation, and moreover works without the need to formalise a possibly complicated hand model.
We first discuss the related work relevant to our research (Sec. II) and then describe the setup of our system within an automobile environment (Sec. III). Subsequently, we describe the sensors and the database used in Sec. IV. In Sec. V we give an account of the different holistic point cloud descriptors used and explain the meaning of the parameter variations we will test. Sec. VI summarises the implemented NN classes and the choice of parameters. The key questions we will investigate in Sec. VII concern the generalisation