Conventional dwell-based methods for text entry by gaze are typically slow and uncomfortable. A swipe-based method that maps the gaze path to words offers an alternative. However, it requires the user to explicitly indicate the beginning and end of each word, which is typically achieved by tedious gaze-only selection. This paper introduces TAGSwipe, a bi-modal method that combines the simplicity of touch with the speed of gaze for swiping through a word, yielding an efficient and comfortable dwell-free text entry method. In a lab study, TAGSwipe achieved an average text entry rate of 15.46 wpm and significantly outperformed conventional swipe-based and dwell-based methods in efficacy and user satisfaction.
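To make the bi-modal mechanism concrete, here is a minimal sketch of the idea: a touch press marks the start of a word, gaze samples trace its path over an on-screen keyboard, and the touch release triggers decoding. The recorder class, the toy key layout, and the nearest-key decoder are illustrative assumptions, not the TAGSwipe implementation (a real system would match the whole path against a lexicon of word shapes).

```python
# Hedged sketch: touch delimits the word, gaze supplies the swipe path.
from dataclasses import dataclass, field

KEY_CENTERS = {"h": (260, 400), "i": (440, 340)}  # toy keyboard layout (px)

@dataclass
class SwipeRecorder:
    recording: bool = False
    path: list = field(default_factory=list)  # (x, y) gaze samples

    def on_touch_down(self):          # touch press marks the word start...
        self.recording, self.path = True, []

    def on_gaze_sample(self, x, y):   # ...gaze traces the swipe path...
        if self.recording:
            self.path.append((x, y))

    def on_touch_up(self):            # ...and touch release ends the word.
        self.recording = False
        return decode(self.path)

def decode(path):
    """Naive decoder: map each gaze sample to its nearest key and
    collapse consecutive repeats."""
    letters = []
    for x, y in path:
        key = min(KEY_CENTERS, key=lambda k: (KEY_CENTERS[k][0] - x) ** 2
                                           + (KEY_CENTERS[k][1] - y) ** 2)
        if not letters or letters[-1] != key:
            letters.append(key)
    return "".join(letters)

rec = SwipeRecorder()
rec.on_touch_down()
for sample in [(265, 402), (300, 380), (438, 338)]:
    rec.on_gaze_sample(*sample)
print(rec.on_touch_up())  # -> "hi"
```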
Eye tracking has evolved into a promising hands-free interaction mechanism for people with disabilities. However, its adoption as a control mechanism in gaming is constrained by erroneous recognition of user intentions and commands. Previous studies have suggested combining eye gaze with other modalities, such as voice input, for an improved interaction experience. However, speech recognition latency and accuracy are major bottlenecks, and dictated verbal commands can disrupt the flow of gameplay. Furthermore, many people with physical disabilities also have speech impairments that prevent them from uttering precise verbal commands. In this work, we introduce nonverbal voice interaction (NVVI) synchronized with gaze for an intuitive hands-free gaming experience. We propose gaze and NVVI (e.g., humming) as a spatio-temporal interaction technique applicable to many modern gaming apps, and developed 'All Birds Must Fly' as a representative app. In a first study with 15 non-disabled participants, we compared the gameplay experience of gaze and NVVI (GV) with the conventional mouse and keyboard (MK). Participants could effectively control the game environment with GV (as expected, somewhat more slowly than with MK); more importantly, they found GV more engaging, fun, and enjoyable. In a second study with 10 participants, we validated the feasibility of GV with the target user group of people with disabilities.
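To illustrate the spatio-temporal division of labour, the sketch below pairs gaze (the "where": a 2-D point) with a crude hum detector (the "when": a trigger held for the duration of the vocalization). The energy-based detector, the thresholds, and the `game_tick` function are assumptions for illustration; they are not the NVVI classifier or game loop used in the study.

```python
# Hedged sketch: gaze supplies the position, a sustained hum supplies the trigger.
import numpy as np

SAMPLE_RATE = 16_000
FRAME = 512                 # ~32 ms analysis frames at 16 kHz
ENERGY_THRESH = 0.02        # RMS level treated as voicing (tune per microphone)
MIN_FRAMES = 6              # require ~0.2 s of sustained sound to count as a hum

def is_humming(audio: np.ndarray) -> bool:
    """True if the buffer contains a sustained loud segment (a crude
    stand-in for a real hum/voicing classifier)."""
    frames = audio[: len(audio) // FRAME * FRAME].reshape(-1, FRAME)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    active = rms > ENERGY_THRESH
    run = best = 0
    for a in active:            # longest run of consecutive active frames
        run = run + 1 if a else 0
        best = max(best, run)
    return best >= MIN_FRAMES

def game_tick(gaze_xy, mic_buffer):
    """One update step: act at the gaze point while the user hums."""
    if is_humming(mic_buffer):
        return ("ACT", gaze_xy)     # e.g., steer toward the gaze point
    return ("IDLE", gaze_xy)

# Demo: one second of a synthetic 150 Hz hum triggers the action.
hum = 0.1 * np.sin(2 * np.pi * 150 * np.arange(SAMPLE_RATE) / SAMPLE_RATE)
print(game_tick((512, 300), hum))  # -> ('ACT', (512, 300))
```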
Non-verbal voice expressions (NVVEs) have been adopted as a means of human-computer interaction in research studies. However, exploration of non-verbal voice-based interaction has been constrained by the limited availability of suitable training data and computational methods for classifying such expressions, leading to a focus on simple binary inputs. We address this issue with CNVVE, a new dataset containing 950 audio samples spanning 6 classes of voice expressions, collected from 42 speakers who donated voice recordings. A classifier was trained on the data using features derived from mel-spectrograms. Furthermore, we studied the effectiveness of data augmentation and significantly improved on the baseline model's accuracy, reaching a test accuracy of 96.6% under 5-fold cross-validation. We have made CNVVE publicly available in the hope that it will serve as a benchmark for future research.
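As a rough illustration of the pipeline the abstract outlines (mel-spectrogram features, data augmentation, and a classifier), here is a minimal sketch using librosa and PyTorch. The layer sizes, the augmentation choices, and the dummy input are assumptions for illustration, not details taken from the CNVVE paper.

```python
# Hedged sketch: log-mel features, toy augmentation, and a small CNN.
import numpy as np
import librosa
import torch
import torch.nn as nn

SR, N_MELS = 16_000, 64

def mel_features(y: np.ndarray) -> torch.Tensor:
    """Waveform -> log-mel spectrogram, shaped (1, n_mels, frames)."""
    m = librosa.feature.melspectrogram(y=y, sr=SR, n_mels=N_MELS)
    return torch.from_numpy(librosa.power_to_db(m)).unsqueeze(0).float()

def augment(y: np.ndarray) -> np.ndarray:
    """Toy augmentation: additive noise and random gain."""
    y = y + 0.005 * np.random.randn(len(y))
    return y * np.random.uniform(0.8, 1.2)

class SmallCNN(nn.Module):
    """Tiny CNN classifier over log-mel inputs; sizes are illustrative."""
    def __init__(self, n_classes: int = 6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.net(x)

# One forward pass on a dummy 1-second clip:
wave = np.random.randn(SR).astype(np.float32)
logits = SmallCNN()(mel_features(augment(wave)).unsqueeze(0))
print(logits.shape)  # torch.Size([1, 6])
```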