Typing on a touchscreen keyboard is very difficult without being able to see the keyboard. We propose a new approach in which users imagine a Qwerty keyboard somewhere on the device and tap out an entire sentence without any visual reference to the keyboard and without intermediate feedback about the letters or words typed. To demonstrate the feasibility of our approach, we developed an algorithm that decodes blind touchscreen typing with a character error rate of 18.5%. Our decoder currently uses three components: a model of the keyboard topology and tap variability, a point transformation algorithm, and a long-span statistical language model. Our initial results demonstrate that our proposed method provides fast entry rates and promising error rates. On one-third of the sentences, novices' highly noisy input was successfully decoded with no errors.
MOTIVATION AND APPROACH
Entering text on a touchscreen mobile device typically involves visually-guided tapping on a Qwerty keyboard. For users who are blind, visually-impaired, or using a device eyes-free, such visually-guided tapping is difficult or impossible. Existing approaches are slow (e.g. the split-tapping method of the iPhone's VoiceOver feature), require chorded Braille input (e.g. Perkinput [1], BrailleTouch [3]), or require word-at-a-time confirmation and correction (e.g. the Fleksy iPhone/Android app by Syntellia).

Rather than designing a letter- or word-at-a-time recognition interface, we present initial results on an approach in which recognition is postponed until an entire sentence of noisy tap data has been collected. This may improve users' efficiency by avoiding the distraction of intermediate letter- or word-level recognition results. Users enter a whole sequence of taps on a keyboard they imagine somewhere on the screen but cannot actually see. We then decode the user's entire intended sentence from the imprecise tap data. Our recognizer searches for the most likely character sequence under a probabilistic keyboard and language model.

The keyboard model places a 2D Gaussian with a diagonal covariance matrix on each key. For each tap, the model produces a likelihood for each of the possible letters on the keyboard, with higher likelihoods for letters closer to the tap's location. Our 9-gram character language model uses Witten-Bell smoothing and was trained on billions of words of Twitter, Usenet, and blog data. The language model has 9.8M parameters and a compressed disk size of 67 MB.

Since users are imagining the keyboard's location and size, their actual tap locations are unlikely to correspond well with any fixed keyboard location. We compensate for this by geometrically transforming the tap points as shown in Figure 1. We allow taps to be scaled along the x- and y-dimensions, translated horizontally and vertically, and rotated by up to 20 degrees. We also search for two multiplicative factors that adjust the x- and y-variance of the 2D Gaussians (see the sketches after this section).

Our current decoder operates offline, finding the best ...
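The keyboard model and point transformation are described above in prose only; the following is a minimal sketch, in Python with NumPy, of how a transformed tap might be scored under per-key 2D Gaussians with diagonal covariance. The parameterization (x/y scale, translation, rotation, and two variance factors) follows the description, but the function names and data layout are our own illustration, not the authors' implementation.

```python
import numpy as np

def transform_taps(taps, sx, sy, tx, ty, theta):
    """Scale, translate, and rotate raw tap points (N x 2 array).

    sx, sy: scale factors along x and y
    tx, ty: horizontal and vertical translation
    theta:  rotation angle in radians (the paper limits rotation to ~20 degrees)
    """
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    scaled = taps * np.array([sx, sy])
    return scaled @ rot.T + np.array([tx, ty])

def tap_log_likelihoods(tap, key_centers, key_var, vx, vy):
    """Log-likelihood of one (transformed) tap under each key's 2D Gaussian.

    key_centers: dict mapping letter -> (x, y) center of that key
    key_var:     (var_x, var_y) baseline diagonal covariance of a key
    vx, vy:      multiplicative factors adjusting the x- and y-variance
    """
    var_x, var_y = key_var[0] * vx, key_var[1] * vy
    scores = {}
    for letter, (cx, cy) in key_centers.items():
        dx, dy = tap[0] - cx, tap[1] - cy
        # Diagonal-covariance Gaussian log-density.
        scores[letter] = (-0.5 * (dx * dx / var_x + dy * dy / var_y)
                          - 0.5 * np.log(4 * np.pi ** 2 * var_x * var_y))
    return scores
```

Letters near the transformed tap location receive higher log-likelihoods; in the decoder these keyboard scores are combined with the character language model, and the transform and variance parameters are themselves searched over rather than fixed.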
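The section states that the recognizer searches for the most likely character sequence under the combined keyboard and language model, but does not specify the search procedure. The sketch below uses a generic beam search over characters, assuming keyboard and language-model scores are summed in log space; the beam width, the `lm_logprob` interface, and the one-letter-per-tap assumption are illustrative choices, not details from the paper.

```python
import heapq

def beam_decode(tap_scores, lm_logprob, beam_width=50):
    """Search for a likely character sequence given per-tap letter scores.

    tap_scores: list (one dict per tap) of letter -> keyboard log-likelihood
    lm_logprob: function (history_string, letter) -> character LM log-probability
    Returns the highest-scoring hypothesis string.
    """
    beam = [("", 0.0)]  # (hypothesis, total log-probability)
    for scores in tap_scores:
        candidates = []
        for hyp, logp in beam:
            for letter, kb_logp in scores.items():
                total = logp + kb_logp + lm_logprob(hyp, letter)
                candidates.append((hyp + letter, total))
        # Keep only the best hypotheses to bound the search.
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[1])
    return max(beam, key=lambda c: c[1])[0]
```

With the per-tap scores from the previous sketch as `tap_scores` and a character 9-gram model behind `lm_logprob`, the returned hypothesis is the decoder's best guess at the intended sentence.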