Towards developing general models of usability with PARADISE

Walker et al. (2000). DOI: 10.1017/s1351324900002503

Abstract: The design of methods for performance evaluation is a major open research issue in the area of spoken language dialogue systems. In this paper we present the PARADISE methodology for developing predictive models of spoken dialogue performance, and then show how to evaluate the predictive power and generalizability of such models. To illustrate our methodology, we develop a number of models for predicting system usability (as measured by user satisfaction), based on the application of PARADISE to experimental d…
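For reference, the performance function at the core of PARADISE (as published in the Walker et al. line of work) regresses user satisfaction on a normalized task-success measure and a set of normalized cost measures; a sketch of its usual form:

\[
\mathrm{Performance} = \alpha \cdot \mathcal{N}(\kappa) \;-\; \sum_{i=1}^{n} w_i \cdot \mathcal{N}(c_i)
\]

Here \(\kappa\) is the Kappa task-success statistic, the \(c_i\) are cost measures (e.g., number of turns, elapsed time, recognition error rate), \(\mathcal{N}\) is a z-score normalization, and \(\alpha\) and the \(w_i\) are weights estimated by multiple linear regression against user satisfaction scores.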

Cited by 176 publications (125 citation statements). References 22 publications.
“…The steps introduced below are all micro-evaluation in this scheme: although the evaluation context is broadened with each step, the effect of the isolated component is nevertheless what is evaluated. Although macro-evaluation of human-like systems is beyond the scope of this article, it is worth mentioning that the metrics we describe here could, for example, be used instead of or in parallel with the cost measures in PARADISE (Walker et al., 2000), with a resulting system-wide mapping between human-likeness, task success, and user satisfaction. Developing and evaluating a component for increased human-likeness can be described as a multi-step process, where the component is tested in an increasingly broad context.…”
Section: Evaluation Context
confidence: 99%
“…The PARADISE evaluation scheme (Walker et al., 2000), for example, calls for a success/failure judgement to be made by a participant after each information transaction -- something that could easily become very disruptive. Still, the responses need to be collected during or shortly after the interaction occurs, lest it turn into a memory test.…”
Section: Human Judgement
confidence: 99%
“…These aspects can be used as the basis for usability evaluation strategies. Many frameworks and methodologies have been developed and used for the evaluation of spoken dialogue systems in recent work [8,9,13,15,17,18,21,24,30].…”
Section: 12
confidence: 99%
“…address the problem of predicting system usability and user satisfaction from measurable performance criteria. This is the case of the PARADISE framework (Walker et al., 2000a), which has become one of the reference frameworks for system evaluation.…”
Section: COCOSDA (Coordinating Committee on Speech Databases and Spee…; http://www.cocosda.org/)
confidence: 99%
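To make the prediction problem referenced above concrete, here is a minimal sketch of how a PARADISE-style usability model could be fit. The data, column choices, and cost measures are hypothetical, and the normalization/regression recipe follows the framework's general form rather than the paper's exact experimental setup:

    import numpy as np

    # Hypothetical per-dialogue measures (illustrative values only):
    # task success (Kappa), two cost measures, and the user satisfaction
    # score each dialogue received in a survey.
    kappa        = np.array([0.90, 0.70, 0.80, 0.40, 0.60, 0.95, 0.50, 0.85])
    turns        = np.array([12, 20, 15, 30, 22, 10, 28, 14], dtype=float)
    asr_error    = np.array([0.05, 0.15, 0.10, 0.30, 0.20, 0.04, 0.25, 0.08])
    satisfaction = np.array([4.5, 3.2, 4.0, 2.1, 3.0, 4.8, 2.5, 4.2])

    def zscore(x):
        # PARADISE normalizes every measure to zero mean / unit variance
        # so the learned weights are comparable across measures.
        return (x - x.mean()) / x.std()

    # Design matrix: intercept, normalized task success, normalized costs.
    # Costs enter with their raw sign; the regression assigns them negative
    # weights if they reduce satisfaction, matching the framework's
    # "success minus weighted costs" reading of the fitted model.
    X = np.column_stack([
        np.ones_like(satisfaction),
        zscore(kappa),
        zscore(turns),
        zscore(asr_error),
    ])

    coef, *_ = np.linalg.lstsq(X, satisfaction, rcond=None)
    intercept, alpha, w_turns, w_asr = coef
    print(f"alpha (task success)  = {alpha:+.2f}")
    print(f"weight (turns)        = {w_turns:+.2f}")
    print(f"weight (ASR error)    = {w_asr:+.2f}")

    # Predicted satisfaction for each dialogue, and the model's R^2 --
    # the kind of fit statistic the generalizability analyses compare
    # across systems and user populations.
    pred = X @ coef
    ss_res = ((satisfaction - pred) ** 2).sum()
    ss_tot = ((satisfaction - satisfaction.mean()) ** 2).sum()
    print(f"R^2 = {1 - ss_res / ss_tot:.2f}")

On real data the cost set, the satisfaction instrument, and the normalization would follow the experimental design; the point here is only the shape of the model being fit and evaluated.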