Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language models (LLMs). Through zero-shot, few-shot, and fine-tuning experiments on four public conversation datasets, we show that for replies to the initial turn of a dialog, an LLM with 64B parameters is able to accurately expand over 70% of phrases with abbreviation length up to 10, leading to an effective keystroke saving rate (KSR) of up to 77% on these expansions. Including a small amount of context in the form of a single conversation turn more than doubles abbreviation expansion accuracies compared to having no context, an effect that is more pronounced for longer phrases. Additionally, the robustness of the models against typo noise can be enhanced through fine-tuning on noisy data.

* Equal contribution.
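To make the proposed abbreviation scheme concrete, the sketch below shows word-initial-letter abbreviation and a simple keystroke saving rate computation. The helper names and the KSR definition (fraction of characters saved relative to typing the full phrase, counting one keystroke per character) are illustrative assumptions, not the paper's exact evaluation code.

```python
def abbreviate(phrase: str) -> str:
    """Abbreviate a phrase to its word-initial letters (illustrative scheme)."""
    return "".join(word[0].lower() for word in phrase.split())


def keystroke_saving_rate(phrase: str, abbreviation: str) -> float:
    """Fraction of keystrokes saved, assuming one keystroke per character."""
    return 1.0 - len(abbreviation) / len(phrase)


phrase = "good morning how are you"
abbrev = abbreviate(phrase)                    # "gmhay", length 5
ksr = keystroke_saving_rate(phrase, abbrev)    # 1 - 5/24, roughly 0.79
```

An LLM would then be prompted, with any available conversation context, to expand "gmhay" back into candidate full phrases such as the original.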
Related Work

Abbreviation expansion for text entry. Previous research on aiding text entry through abbreviation expansion (AE) used abbreviation schemes such as keeping only content words (Demasco and McCoy, 1992), discarding certain vowels and consonants (Shieber and Nelken, 2007), and flexible letter-saving schemes (Pini et al., 2010; Adhikary et al., 2021; Gorman et al., 2021). Spontaneous abbreviation schemes primarily omit vowels, repeated consonants, final characters, and spaces, and lead to modest KSRs (e.g., 25-40% in Willis et al. 2005, and 21% in Adhikary et al. 2021). The low KSR of such schemes can be attributed to the implicit requirement that a human reader be able to decode the phrases without significant cognitive burden. N-gram models and neural language models (LMs) have been applied to expanding abbreviations under these relatively low-KSR schemes. By using LSTM models and context, Gorman et al.