This paper proposes a three-dimensional (3D) point-of-intention (POI) determination method using multimodal fusion between hand pointing and eye gaze for a 3D virtual display. In the method, the finger-joint configurations of the pointing hand, sensed by a Leap Motion sensor, are first detected as pointing-intention candidates. Subsequently, frame-to-frame differences, which should occur during the hand-pointing period, are checked by AND logic against the hand-pointing intention candidates. The crossing point between the eye-gaze and hand-pointing lines is finally determined using the closest-distance concept. To evaluate the performance of the proposed method, experiments were conducted with ten participants, each of whom looked at and pointed at nine test points for approximately five seconds each. The experimental results show that the proposed method measures 3D POIs at 75 cm, 85 cm, and 95 cm with average distance errors of 4.67%, 5.38%, and 5.71%, respectively.
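The "closest distance concept" for intersecting the eye-gaze and hand-pointing lines can be illustrated with the standard geometry of the common perpendicular between two 3D lines. The sketch below is an assumption about how such a step might be implemented (the function name and example coordinates are hypothetical, not from the paper): since the two measured rays rarely intersect exactly, the midpoint of the shortest segment between them is taken as the 3D POI.

```python
import numpy as np

def closest_point_between_lines(p1, d1, p2, d2):
    """Midpoint of the shortest segment between two 3D lines.

    Each line is given by an origin p and a direction d. When the
    gaze and pointing rays are skew, the point halfway along their
    common perpendicular serves as the estimated crossing point.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w0 = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2   # a == c == 1 after normalisation
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:                 # lines (nearly) parallel
        t, s = 0.0, e / c
    else:
        t = (b * e - c * d) / denom
        s = (a * e - b * d) / denom
    q1 = p1 + t * d1                      # closest point on line 1
    q2 = p2 + s * d2                      # closest point on line 2
    return (q1 + q2) / 2.0

# Hypothetical example: an eye-gaze ray and a hand-pointing ray that
# both pass (approximately) through a target near (0, 0, 85) cm.
poi = closest_point_between_lines(
    np.array([0.0, 5.0, 0.0]),   np.array([0.0, -5.0, 85.0]),
    np.array([0.0, -30.0, 0.0]), np.array([0.0, 30.0, 85.0]))
```

This is the classical closest-approach construction for skew lines; the paper's own decision rule may differ in details such as weighting one modality over the other.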
People find it challenging to control smart systems with complex gaze gestures because of the unreliability of eye saccades. Existing works achieve good recognition accuracy for simple gaze gestures, for which sufficient eye-gaze points are available, but simple gaze gestures have limited applications compared to complex ones. Complex gaze gestures are composed of multiple eye-fixation subunits, yielding a sequence of gaze points that are clustered and rotated according to an underlying head-orientation relationship. This paper proposes a new sequence representation, combining eye-gaze points and head-orientation angles, for recognizing complex gaze gestures; both features strongly influence gaze-gesture formation. The new sequence is obtained by aligning clustered gaze points with head-orientation angles and applying a simple moving average (SMA) to denoise the data and interpolate the gaps between successive eye fixations. The aligned sequences of complex gaze gestures are then used to train sequential machine learning (ML) algorithms. To evaluate the performance of the proposed method, we recruited ten participants and recorded their eye-gaze and head-orientation features using an eye tracker. The results show that a Boosted Hidden Markov Model (HMM) using the Random Subspace method achieved the best accuracies of 94.72% and 98.1% for complex and simple gestures, respectively, outperforming conventional methods.
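The SMA denoising-and-interpolation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name, window size, and the layout of the feature columns (e.g. gaze x/y plus head yaw/pitch) are hypothetical, and gaps between successive fixations are assumed to appear as NaN frames.

```python
import numpy as np

def sma_smooth(seq, window=5):
    """Denoise and gap-fill a per-frame feature sequence with a
    simple moving average (SMA).

    seq: (N, D) array, e.g. columns [gaze_x, gaze_y, head_yaw,
    head_pitch]. NaN rows (gaps between successive fixations) are
    first filled by linear interpolation; each column is then
    smoothed with a centred moving average whose window shrinks at
    the sequence edges.
    """
    out = np.asarray(seq, dtype=float).copy()
    idx = np.arange(len(out))
    kernel = np.ones(window) / window
    for c in range(out.shape[1]):
        col = out[:, c]
        gaps = np.isnan(col)
        if gaps.any():                       # interpolate missing frames
            col[gaps] = np.interp(idx[gaps], idx[~gaps], col[~gaps])
        smoothed = np.convolve(col, kernel, mode="same")
        # Renormalise so edge frames, which see fewer neighbours,
        # are not biased toward zero.
        norm = np.convolve(np.ones_like(col), kernel, mode="same")
        out[:, c] = smoothed / norm
    return out

# Hypothetical example: a gaze-x ramp with one missing frame and a
# constant head-yaw column.
aligned = sma_smooth(np.column_stack([[0.0, 1.0, np.nan, 3.0, 4.0],
                                      [5.0, 5.0, 5.0, 5.0, 5.0]]))
```

Sequences smoothed this way could then be fed to a sequential classifier such as an HMM; the boosting and Random Subspace stages reported in the abstract are separate training-time techniques not shown here.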