2006
DOI: 10.21236/ada457300

The RADAR Test Methodology: Evaluating a Multi-Task Machine Learning System with Humans in the Loop

Abstract: The RADAR project involves a collection of machine learning research thrusts that are integrated into a cognitive personal assistant. Progress is examined with a test developed to measure the impact of learning when used by a human user. Three conditions (conventional tools, Radar without learning, and Radar with learning) are evaluated in a large-scale, between-subjects study. This paper describes the RADAR Test with a focus on test design, test harness development, experiment execution, and analysis. Results …

Cited by 6 publications (3 citation statements)
References 11 publications
“…To investigate these questions, the RADAR team carried out extensive experimental evaluation. The details of the evaluation are reported by others in [18]; here, we give a summary.…”
Section: Discussion (mentioning)
confidence: 99%
“…The participants' primary email task is to read provided emails about an upcoming academic conference and consolidate all the changes that need to be made to the conference schedule and website [27]. They were given a spreadsheet with information about conference speakers, sessions, and talks, and asked to make changes to it based on change requests in the email, in 12 minutes.…”
Section: User Labels - Physical Activity Coach (mentioning)
confidence: 99%
“…They were given a spreadsheet with information about conference speakers, sessions, and talks, and asked to make changes to it based on change requests in the email, in 12 minutes. The emails and task were modified from the RADAR dataset [27]. The emails in the data set were labeled with a folder name, which was removed to test the participants.…”
Section: User Labels - Physical Activity Coach (mentioning)
confidence: 99%