Increasing amounts of public, corporate, and private speech data are now available online. Their usefulness is limited, however, by the lack of tools for browsing and searching them. The goal of our research is to provide tools that overcome the inherent difficulties of speech access by supporting visual scanning, search, and information extraction. We describe a novel principle for the design of UIs to speech data: What You See Is Almost What You Hear (WYSIAWYH). In WYSIAWYH, automatic speech recognition (ASR) generates a transcript of the speech data. The transcript is then used as a visual analogue to that underlying data. A graphical user interface allows users to visually scan, read, annotate, and search these transcripts. Users can also use the transcript to access and play specific regions of the underlying message. We first summarize previous studies of voicemail usage that motivated the WYSIAWYH principle, and describe a voicemail UI, SCANMail, that embodies WYSIAWYH. We report on a laboratory experiment and a two-month field trial evaluation. SCANMail outperformed a state-of-the-art voicemail system on core voicemail tasks. This was attributable to SCANMail's support for visual scanning, search, and information extraction. While the ASR transcripts contain errors, they nevertheless improve the efficiency of voicemail processing. Transcripts provide enough information for users either to extract key points or to navigate to important regions of the underlying speech, which they can then play directly.
Conventional program guides present television shows in a list view, with metadata displayed in a separate window. However, this linear presentation style prevents users from fully exploring and utilizing the diverse, descriptive, and highly connected data associated with television programming. Additionally, despite the fact that program guides are the primary selection interface for television shows, few include integrated recommendation data to help users decide what to watch. iEPG presents a novel interface concept for navigating the multidimensional information space associated with television programming, as well as an effective visualization for displaying complex ratings data. Results from a user study indicate people appreciate the ability to search for content in non-linear ways and are receptive to recommendation systems and unconventional EPG visualizations.
We have implemented an infinite-resolution multimedia sketchpad as a base for exploring a stream-of-consciousness model of computation where information creation, sharing, and retrieval become so intuitive that the interface becomes invisible. Motivation to pursue this came from work on Pad [4], which can be thought of as a kind of traditional sketchpad or windows environment in the sense that it is a general-purpose substrate for visualizing two-dimensional graphics and text. But Pad also supports the radical notion of being infinite in extent and resolution. We implemented Pad++ to explore smooth zooming for navigation and to serve as a platform for multimedia authoring and information visualization. The ability to work with very large datasets has been a primary design consideration in the development of Pad++.