We study a dynamic and stochastic knapsack problem in which a decision maker is sequentially presented with items arriving according to a Bernoulli process over n discrete time periods. Items have equal rewards and independent weights that are drawn from a known nonnegative continuous distribution F. The decision maker seeks to maximize the expected total reward of the items included in the knapsack while satisfying a capacity constraint and while making terminal decisions as soon as each item weight is revealed. Under mild regularity conditions on the weight distribution F, we prove that the regret (the expected difference between the performance of the best sequential algorithm and that of a prophet who sees all of the weights before making any decision) is at most logarithmic in n. Our proof is constructive: we devise a reoptimized heuristic that achieves this regret bound.
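The setting above can be made concrete with a small simulation. The sketch below is illustrative only and rests on several assumptions not stated in the abstract: weights are Uniform(0, 1), and the threshold rule (re-solving a fluid-style budget equation each period) is one plausible form of a reoptimized heuristic, not necessarily the one analyzed in the paper. The function name `simulate_knapsack` and its parameters are hypothetical.

```python
import math
import random


def simulate_knapsack(n, capacity, arrival_prob, rng):
    """Simulate one run; return (online_count, prophet_count).

    Assumptions (not from the source): weights are Uniform(0, 1), and the
    online rule accepts a weight w when w <= h, where h is re-solved each
    period from  periods_left * arrival_prob * h**2 / 2 = remaining.
    """
    remaining = capacity
    online_count = 0
    arrived = []
    for t in range(n):
        if rng.random() >= arrival_prob:
            continue  # no item arrives in this period
        w = rng.random()  # weight revealed; the decision is terminal
        arrived.append(w)
        periods_left = n - t  # counts the current period
        # Re-solved threshold: expected weight accepted over the remaining
        # periods under threshold h matches the remaining capacity.
        h = math.sqrt(2.0 * remaining / (periods_left * arrival_prob))
        if w <= min(h, remaining):  # never exceed the remaining capacity
            remaining -= w
            online_count += 1

    # Prophet benchmark: with equal rewards, packing the lightest arrived
    # items first maximizes the number of items that fit.
    prophet_count = 0
    budget = capacity
    for w in sorted(arrived):
        if w > budget:
            break
        budget -= w
        prophet_count += 1
    return online_count, prophet_count


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_knapsack(1000, 10.0, 0.5, rng) for _ in range(500)]
    gap = sum(p - o for o, p in runs) / len(runs)
    print(f"average prophet-minus-online gap: {gap:.2f}")
```

Averaging the gap over many runs for increasing n gives an empirical feel for how the difference to the prophet grows, although a simulation of this kind cannot confirm the logarithmic bound.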
Abstract. Given a sequence of n independent random variables with common continuous distribution, we propose a simple adaptive online policy that selects a monotone increasing subsequence. We show that the expected number of monotone increasing selections made by such a policy is within O(log n) of optimal. Our construction provides a direct and natural way of proving the O(log n)-optimality gap. An earlier proof of the same result made crucial use of a key inequality of Bruss and Delbaen (2001) and of de-Poissonization.
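A minimal sketch of an adaptive online selection rule of this kind follows. It assumes Uniform(0, 1) observations, and the acceptance window of width min(sqrt(2(1 - s)/k), 1 - s), with s the last selected value and k the number of observations still to come, is an assumed form rather than a rule quoted from the abstract. The helper `longest_increasing_length` computes the offline longest increasing subsequence, which serves only as an upper bound for sanity checking; both function names are hypothetical.

```python
import bisect
import math
import random


def online_increasing_selection(values):
    """Select a strictly increasing subsequence online.

    Assumed rule (not from the source): with last selected value s and k
    observations remaining, accept x when s < x <= s + width, where
    width = min(sqrt(2 * (1 - s) / k), 1 - s).
    """
    n = len(values)
    last = 0.0
    chosen = []
    for i, x in enumerate(values):
        k = n - i  # observations remaining, including the current one
        width = min(math.sqrt(2.0 * (1.0 - last) / k), 1.0 - last)
        if last < x <= last + width:
            chosen.append(x)
            last = x  # the window adapts to the new state
    return chosen


def longest_increasing_length(values):
    """Length of the longest strictly increasing subsequence (offline)."""
    tails = []  # tails[j] = smallest tail of an increasing run of length j + 1
    for x in values:
        j = bisect.bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(tails)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.random() for _ in range(10_000)]
    picked = online_increasing_selection(sample)
    print(len(picked), longest_increasing_length(sample))
```

The window narrows when many observations remain and widens as the horizon shrinks, which is what makes the rule adaptive rather than a fixed-threshold policy.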