Abstract. Speedup learning systems are typically evaluated by comparing their impact on a problem solver's performance. The impact is measured by running the problem solver, before and after learning, on a sample of problems randomly drawn from some distribution. Often, the experimenter imposes a bound on the CPU time the problem solver is allowed to spend on any individual problem. Segre et al. (1991) argue that the experimenter's choice of time bound can bias the results of the experiment. To address this problem, we present statistical hypothesis tests specifically designed to analyze speedup data and eliminate this bias. We apply the tests to the data reported by Etzioni (1990a) and show that most (but not all) of the speedups observed are statistically significant.

Keywords. speedup learning, statistics, explanation-based learning, experimental methodology
The statistical tests described in this article are encoded as COMMON LISP routines. The routines, and the data analyzed in the article, are available by sending mail to ETZIONI@CS.WASHINGTON.EDU. We hope that other researchers will use the routines to validate their own speedup learning experiments.

Motivation

Speedup learning systems are systems that automatically generate search-control knowledge (e.g., Etzioni, 1990b; Knoblock, 1990; Minton, 1988a; Mooney, 1989; O'Rorke, 1989; Shavlik, 1990). The effectiveness of a speedup learning system is typically evaluated by comparing the performance of a problem solver, guided by the learned knowledge, with the performance of the problem solver given no control knowledge, or given control knowledge acquired by a different learning system. The problem solver is run on a sample of problems randomly drawn from some distribution. In many experiments, the problem solver requires an inordinately long time to solve one or more of the problems due to the combinatorial nature of its search. To allow the experiments to complete in reasonable time, the experimenter imposes a bound on the CPU time that the problem solver is allowed to spend on any individual problem. When that bound is exceeded, the problem is marked "unsolved" and the problem solver moves on to the next problem. The same time bound