Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems 2007
DOI: 10.1145/1660877.1660901

Evaluation of an integrated multi-task machine learning system with humans in the loop

Abstract: Performance of a cognitive personal assistant, RADAR, consisting of multiple machine learning components, natural language processing, and optimization, was examined with a test explicitly developed to measure the impact of integrated machine learning when used by a human user in a real-world setting. Three conditions (conventional tools, RADAR without learning, and RADAR with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning does pro…

Cited by 13 publications (11 citation statements)
References 9 publications
“…This section provides an overview of the test; a complete description can be found elsewhere [19]. The evaluation was designed to present participants with a challenging email overload workload that satisfied the following criteria.…”
Section: The Conference Planning Test
Citation type: mentioning
confidence: 99%
“…The test emails included anonymized real emails and fabricated ones, the latter necessary in part to make the emails consistent with the simulated world [19]. A team of undergraduate English majors was employed to create a detailed backstory email corpus, independent messages detailing one or more tasks, and noise messages, which were unrelated to the conference.…”
Section: The Conference Planning Test
Citation type: mentioning
confidence: 99%
“…In fashion, the human component of algorithm evaluation is necessary [19,23]. Guided by this, we identify candidate applications where a fashion ontology enhanced with a better subjective data representation would likely be helpful.…”
Section: Fashion Ontology Use Cases
Citation type: mentioning
confidence: 99%