Despite considerable work in automatic meeting summarization over the last few years, comparing results remains difficult due to varied task conditions and evaluations. To address this issue, we present a method for determining the best possible extractive summary with respect to an evaluation metric such as ROUGE. Our oracle system is based on a knapsack-packing framework; although the underlying optimization problem is NP-hard, it can be solved nearly optimally with a genetic algorithm. To frame new research results in a meaningful context, we suggest presenting our oracle results alongside two simple baselines. We report oracle and baseline results for a variety of evaluation scenarios that have recently appeared in this field.
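To make the oracle search concrete, the following is a minimal Python sketch of a knapsack-style extractive oracle solved with a genetic algorithm, in the spirit of the approach described above. It is not the paper's implementation: all function names and parameters here are our own assumptions, and the toy unigram-recall scorer merely stands in for a real ROUGE evaluation against reference summaries.

```python
import random

def oracle_ga(sentences, lengths, budget, score_fn,
              pop_size=100, generations=200, mutation_rate=0.02):
    """Approximate the best extractive summary: choose a subset of
    sentences maximizing score_fn subject to a total-length budget
    (a 0/1 knapsack), searched with a simple genetic algorithm.
    Assumes at least two candidate sentences."""
    n = len(sentences)

    def fitness(mask):
        # Packings that exceed the length budget are infeasible.
        if sum(l for l, m in zip(lengths, mask) if m) > budget:
            return 0.0
        return score_fn([s for s, m in zip(sentences, mask) if m])

    # Population of random inclusion masks (one bit per sentence).
    pop = [[random.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[: pop_size // 5]          # keep the fittest fifth
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [not g if random.random() < mutation_rate else g
                     for g in child]              # bit-flip mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return [s for s, m in zip(sentences, best) if m]

if __name__ == "__main__":
    # Toy usage: unigram recall against one reference stands in for ROUGE.
    sents = ["the committee approved the budget",
             "lunch was served at noon",
             "the proposal passed after a short debate"]
    ref = set("the committee approved the budget proposal".split())
    recall = lambda sel: len({w for s in sel for w in s.split()} & ref) / len(ref)
    print(oracle_ga(sents, [len(s.split()) for s in sents], budget=12,
                    score_fn=recall))
```

Because the objective is evaluated only through the black-box score function, the same search works for any metric; swapping the toy scorer for a ROUGE call against the reference summaries would yield the kind of oracle upper bound the abstract proposes.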