1994
DOI: 10.1007/bf00993983

Statistical methods for analyzing speedup learning experiments

Abstract: Speedup learning systems are typically evaluated by comparing their impact on a problem solver's performance. The impact is measured by running the problem solver, before and after learning, on a sample of problems randomly drawn from some distribution. Often, the experimenter imposes a bound on the CPU time the problem solver is allowed to spend on any individual problem. Segre et al. (1991) argue that the experimenter's choice of time bound can bias the results of the experiment. To address this pr…

Cited by 17 publications (18 citation statements)
References 12 publications
“…Segre et al (1991) demonstrated that small captimes can lead to misleading conclusions when evaluating explanation-based learning algorithms. Etzioni and Etzioni (1994) extended statistical tests to deal with partially-censored runs in an effort to limit the large impact of captimes observed by Segre et al (1991). Simon and Chatalic (2001) demonstrated the relative robustness of comparisons between SAT solvers for three different captimes.…”
Section: Introduction
confidence: 99%
“…For Claim 2 and Claim 3, we compute statistical significance tests to support our results. Note that Segre et al [1991] argued that time bound experiment can bias the result of the evaluation and Etzioni and Etzioni [1994] extensively discussed some possible solutions. In our case, the time bound is applied to both the Dapper/Pipe and Karma results.…”
Section: Results
confidence: 99%
“…The unoptimized CSP program was by far the least efficient. These conclusions regarding the relative efficiencies of the four programs can be justified statistically using the methodology proposed by Etzioni and Etzioni (1993). Specifically, any pairwise comparison using a simple sign test on the four programs' completion times (on all 100 test instances) is statistically significant with p ≤ .05.…”
Section: Comparison With Hand-coded Algorithms
confidence: 99%
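
The pairwise sign test mentioned in the last statement can be sketched as follows. This is a minimal illustration, not the procedure from the paper: the function name `censored_sign_test`, the way runs are capped at the time bound, and the choice to drop pairs where both runs hit the bound are all assumptions made for this example.

```python
from math import comb


def censored_sign_test(times_a, times_b, bound):
    """Exact two-sided sign test on paired completion times.

    Hypothetical helper for illustration. Times are capped at `bound`
    (the CPU time limit). A pair in which both runs reach the bound
    carries no ordering information and is dropped, as are exact ties.
    Returns (a_wins, b_wins, p_value).
    """
    a_wins = b_wins = 0
    for a, b in zip(times_a, times_b):
        a, b = min(a, bound), min(b, bound)
        if a < b:
            a_wins += 1
        elif b < a:
            b_wins += 1
        # a == b (including both censored): uninformative, skipped

    n = a_wins + b_wins
    if n == 0:
        return a_wins, b_wins, 1.0

    # Under the null hypothesis each informative pair is a fair coin flip,
    # so the smaller win count follows Binomial(n, 0.5).
    k = min(a_wins, b_wins)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return a_wins, b_wins, min(1.0, 2 * tail)
```

Because the sign test only needs to know which run of each pair finished first, a pair in which exactly one run hits the bound still counts as a win for the other run. That is one reason this kind of test is less sensitive to the choice of time bound than a comparison of mean CPU times.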