Computer voice is experiencing a renaissance through the growing popularity of voice-based interfaces, agents, and environments. Yet, how to measure the user experience (UX) of voice-based systems remains an open and urgent question, especially given that their form factors and interaction styles tend to be non-visual, intangible, and often considered disembodied or "body-less." As a first step, we surveyed the ACM and IEEE literatures to determine which quantitative measures and measurements have been deemed important for voice UX. Our findings show that there is little consensus, even with similar situations and systems, as well as an overreliance on lab work and unvalidated scales. In response, we offer two high-level descriptive frameworks for guiding future research, developing standardized instruments, and informing ongoing review work. Our work highlights the current strengths and weaknesses of voice UX research and charts a path towards measuring voice UX in a more comprehensive way.
CCS CONCEPTS • Human-centered computing → Human computer interaction (HCI); Interaction devices; Sound-based input / output.