Speech recognition can be a powerful tool for human-computer interaction, especially when the user's hands are unavailable or otherwise engaged. Researchers have confirmed that existing mechanisms for speech-based cursor control are both slow and error prone. To address this, we evaluated two variations of a novel grid-based cursor controlled via speech recognition. One provides users with nine cursors that can be used to specify the desired location, while the second, more traditional solution provides a single cursor. Our results confirmed a speed/accuracy trade-off: the nine-cursor variant allowed faster task completion times, while the one-cursor version resulted in reduced error rates. Our solutions eliminated the effect of distance and dramatically reduced the importance of target size as compared to previous speech-based cursor control mechanisms. The results are explored through a predictive model and comparisons with results from earlier studies.
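Grid-based speech cursors of this kind typically work by recursive subdivision: the screen is split into a 3x3 grid, speaking a cell number narrows the active region to that cell, and the process repeats until the region is small enough to place the cursor. The sketch below illustrates that general idea only; the function names and cell-numbering scheme are assumptions, not the paper's actual implementation.

```python
def zoom(region, cell):
    """Return the sub-region selected by speaking `cell` (1-9).

    `region` is (x, y, width, height); cells are assumed to be numbered
    left-to-right, top-to-bottom in a 3x3 layout.
    """
    x, y, w, h = region
    row, col = divmod(cell - 1, 3)
    return (x + col * w / 3, y + row * h / 3, w / 3, h / 3)

def locate(screen, spoken_cells):
    """Apply a sequence of spoken cell numbers and return the final
    cursor point (the center of the last region)."""
    region = screen
    for cell in spoken_cells:
        region = zoom(region, cell)
    x, y, w, h = region
    return (x + w / 2, y + h / 2)

# Example: on a 1920x1080 screen, saying "5" then "1" targets a point
# in the top-left cell of the central cell.
print(locate((0, 0, 1920, 1080), [5, 1]))
```

Each spoken digit shrinks the region by a factor of nine, which is why such techniques can make target distance irrelevant: the number of utterances needed depends on target size relative to the screen, not on where the target sits.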
Desktop interaction solutions are often inappropriate for mobile devices due to small screen sizes and portability needs. Speech recognition can improve interactions by providing a relatively hands-free solution usable in a variety of situations. While mobile systems are designed to be used on the move, few studies have examined the effects of motion on mobile interactions. This paper investigates the effect of motion on automatic speech recognition (ASR) input for mobile devices. Recognition error rates (RER) were examined with participants seated or walking while performing text input tasks, as was the effect of ASR enrollment conditions on RER. The results suggest changes to how users train ASR systems for mobile and seated usage.
Speech text entry can be problematic even under ideal dictation conditions, and difficulties are magnified when external conditions deteriorate. Motion during speech is one such condition that can have detrimental effects on automatic speech recognition. This research examined speech text entry while mobile. Speech enrollment profiles were created by participants in both seated and walking environments, and dictation tasks were completed in both conditions. Although results from an earlier study suggested that completing the enrollment process under more challenging conditions may improve recognition accuracy under both challenging and less challenging conditions, the current study produced contradictory results. A detailed review of error rates confirmed that some participants minimized errors by enrolling under more challenging conditions, others benefited from enrolling under less challenging conditions, and still others minimized errors when an enrollment model was used under the opposing condition. Leveraging these insights, we developed a decision model to minimize recognition error rates regardless of the conditions experienced while completing dictation tasks. When the model was applied to existing data, error rates were reduced significantly, but additional research is necessary to validate the proposed solution.