Proceedings of the Third Conference on Applied Natural Language Processing - 1992
DOI: 10.3115/974499.974529

A practical methodology for the evaluation of spoken language systems

Cited by 5 publications (7 citation statements)
References 14 publications
“…They will be manageable if interruption points can be discerned in real time using acoustic information (Nakatani and Hirschberg, 1994). [Parsing Utterances Including Self-Repairs] It is relatively easy to evaluate technologies such as morphological analysis and information retrieval in objective and empirical terms, because unique solutions can be defined for such tasks. Such an evaluation will almost necessarily be a black-box evaluation, as in ATIS (Boisen and Bates, 1992), TREC (Harman, 1995), MUC (MUC, 1991), and so forth. To advance research on dialogue systems, there should be some empirical method for evaluating them.…”
Section: Discussion
confidence: 99%
“…Boisen and Bates developed a methodology based on the collective experiences of BBN's participation in the DARPA projects [15]. Their methodology analyzed many domain-specific evaluation methods to create a general framework for characterizing the evaluation of dialogue systems.…”
Section: Evaluating Dialogue Systems
confidence: 99%
“…The recognition work, combined with the work in this chapter, is ultimately designed to create a complete running CopyCat system for deployment and testing with live children. There has been a great deal of research on evaluating live system testing for dialogue systems designed using Wizard of Oz systems [15,66,72,114,148]. There is a tension between domain-specific criteria, intermediary evaluation and metrics, human judgement, and the input/output mapping of the final system.…”
Section: Introduction
confidence: 99%
“…Other evaluations in the same tradition include the ATIS [2] and TREC [10] evaluations, the first in the domain of database query, emphasizing a spoken language component, the second in the domain of text retrieval. A third set of evaluations, the MUC evaluations of fact extraction systems, is reported by Cowie and Lehnert in this issue and in detail in [18]. The test material for these other evaluations in the ARPA tradition is, however, critically different.…”
Section: Some Past Evaluations
confidence: 99%