2002
DOI: 10.1613/jair.935

A Critical Assessment of Benchmark Comparison in Planning

Abstract: Recent trends in planning research have led to empirical comparison becoming commonplace. The field has started to settle into a methodology for such comparisons, which for obvious practical reasons requires running a subset of planners on a subset of problems. In this paper, we characterize the methodology and examine eight implicit assumptions about the problems, planners and metrics used in many of these comparisons. The problem assumptions are: PR1 the performance of a general purpose planner should not b…

Cited by 42 publications (48 citation statements)
References 21 publications
“…This step could be redundant in some portfolio structures: n-of-n design does not require any selection. The configured portfolio includes all the incorporated planners, and is based on the hypothesis that typically planners either solve a problem quickly or not at all [25]. This strategy is reasonable when all the following hold:…”
Section: Planner Selection (mentioning)
confidence: 99%
“…Reasonably, this means that each planner should run for at least a few tens of seconds: this is because, according to [25], if a planner does not find a solution quickly, it will not find it at all. Finally, we would emphasise that if the target of the portfolio is minimising the runtime, including all the incorporated planners will possibly make it hard to effectively order them.…”
Section: Planner Selection (mentioning)
confidence: 99%
“…This step could be useless in some portfolio structures: n-of-n (ArvandHerd and one of the configuration strategies included in FDSS2) design does not require any selection. The configured portfolio includes all the incorporated planners, and is based on the hypothesis that typical planners either solve a problem quickly or not at all [11]. This strategy is reasonable when: (i) the number of incorporated planners is limited; (ii) all the incorporated planners have really good mean performances; (iii) the maximum amount of CPU time for solving a problem is quite large and, (iv) the target of the portfolio is not minimizing the runtime.…”
Section: Planners Selection (mentioning)
confidence: 99%
“…One of the few past approaches towards the direction of adaptive planning is the BUS system (Howe & Dahlman, 1993; Howe et al., 1999). BUS runs six state-of-the-art planners, namely STAN, IPP, SGP, BlackBox, UCPOP and PRODIGY, using a round-robin schema, until one of them finds a solution.…”
Section: Learning Domain Knowledge (mentioning)
confidence: 99%
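
The round-robin scheme mentioned in the last statement can be illustrated with a small sketch. This is not the BUS implementation: the planner model (a generator that yields `None` per time slice until it yields a plan), the helper `make_toy_planner`, and the slice budget `max_slices` are all assumptions made up for illustration.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical planner model (an assumption, not any real planner's API):
# a generator that yields None for each time slice spent searching and
# yields a plan (a list of action names) on the slice where it succeeds.
Planner = Callable[[str], Iterator[Optional[List[str]]]]


def make_toy_planner(slices_needed: Optional[int], plan: List[str]) -> Planner:
    """Toy planner that returns `plan` on slice number `slices_needed`,
    or never succeeds when `slices_needed` is None."""
    def planner(problem: str) -> Iterator[Optional[List[str]]]:
        if slices_needed is None:
            while True:
                yield None          # searches forever without a plan
        for _ in range(slices_needed - 1):
            yield None              # slice consumed, no plan yet
        yield plan                  # plan found on the final slice
    return planner


def round_robin(
    planners: Dict[str, Planner], problem: str, max_slices: int
) -> Tuple[Optional[str], Optional[List[str]], int]:
    """Give each planner one slice in turn until one returns a plan
    or the total slice budget is exhausted."""
    running = {name: start(problem) for name, start in planners.items()}
    used = 0
    while used < max_slices:
        for name, run in running.items():
            if used >= max_slices:
                break
            used += 1
            result = next(run)
            if result is not None:
                return name, result, used
    return None, None, used


portfolio = {
    "slow": make_toy_planner(5, ["a"]),
    "never": make_toy_planner(None, []),
    "fast": make_toy_planner(2, ["b", "c"]),
}
winner, plan, used = round_robin(portfolio, "toy-problem", max_slices=20)
```

With these toy planners, the planner that needs the fewest slices wins even though it is last in the rotation, which reflects the "solve quickly or not at all" hypothesis quoted in the statements above: a planner that never succeeds only costs its share of the slices.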