In the secretary problem of Cayley (1875) and Moser (1956), n non-negative, independent random variables with a common distribution are sequentially presented to a decision maker who decides when to stop and collect the most recent realization. The goal is to maximize the expected value of the collected element. In the k-choice variant, the decision maker is allowed to make k ≤ n selections to maximize the expected total value of the selected elements. Assuming that the values are drawn from a known distribution with finite support, we prove that the best regret, the expected gap between the optimal online policy and its offline counterpart in which all n values are made visible at time 0, is uniformly bounded in the number of candidates n and the budget k. Our proof is constructive: we develop an adaptive Budget-Ratio policy that achieves this performance. The policy selects or skips values depending on where the ratio of the residual budget to the remaining time stands relative to multiple thresholds that correspond to middle points of the distribution. We also prove that being adaptive is crucial: in general, the minimal regret among non-adaptive policies grows like the square root of n. The difference is the value of adaptiveness.
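The abstract does not spell out the exact thresholds, so the sketch below illustrates only one plausible reading of a Budget-Ratio-style rule: compare the residual-budget-to-remaining-time ratio against per-value thresholds placed at the midpoints of each support point's upper-tail probability. The threshold construction, the toy distribution, and all parameters are illustrative assumptions, not the paper's construction.

```python
import random

def budget_ratio_policy(values, probs, stream, k):
    """Illustrative Budget-Ratio-style policy (a sketch, not the paper's exact rule).

    values, probs : finite support of the known distribution, values sorted ascending
    stream        : the n realizations, revealed one at a time
    k             : selection budget
    Accept a value when the residual-budget / remaining-time ratio exceeds a
    threshold tied to that value's position in the distribution (here: the
    midpoint of its upper-tail probability interval, an assumed reading of
    "middle points of the distribution").
    """
    # upper-tail midpoints: P(X > v_j) + p_j / 2, decreasing in the value v_j
    tail, thresholds = 0.0, {}
    for v, p in zip(reversed(values), reversed(probs)):
        thresholds[v] = tail + p / 2.0
        tail += p

    total, remaining_budget, n = 0.0, k, len(stream)
    for t, x in enumerate(stream):
        remaining_time = n - t
        if remaining_budget >= remaining_time:          # must take everything left
            total += x
            remaining_budget -= 1
        elif remaining_budget > 0 and remaining_budget / remaining_time >= thresholds[x]:
            total += x
            remaining_budget -= 1
    return total


if __name__ == "__main__":
    values, probs = [1.0, 2.0, 3.0], [0.5, 0.3, 0.2]    # toy finite-support distribution
    n, k, trials = 200, 20, 2000
    online = offline = 0.0
    for _ in range(trials):
        stream = random.choices(values, weights=probs, k=n)
        online += budget_ratio_policy(values, probs, stream, k)
        offline += sum(sorted(stream)[-k:])             # prophet takes the k largest values
    print("estimated regret per trial:", (offline - online) / trials)
```

The simulation estimates the regret of this heuristic against the offline prophet; the paper's result concerns the optimal online policy, for which the regret stays bounded as n and k grow.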
Abstract. Consider a sequence of n independent random variables with a common continuous distribution F, and consider the task of choosing an increasing subsequence where the observations are revealed sequentially and where an observation must be accepted or rejected when it is first revealed. There is a unique selection policy π*_n that is optimal in the sense that it maximizes the expected value of L_n(π*_n), the number of selected observations. We investigate the distribution of L_n(π*_n); in particular, we obtain a central limit theorem for L_n(π*_n) and a detailed understanding of its mean and variance for large n. Our results and methods are complementary to the work of Bruss and Delbaen (2004), where an analogous central limit theorem is found for monotone increasing selections from a finite sequence with cardinality N, where N is a Poisson random variable that is independent of the sequence.
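To give a feel for the quantities studied here, the sketch below simulates a simple acceptance-window heuristic for i.i.d. Uniform(0,1) arrivals and compares its sample mean with the classical √(2n) growth rate of the optimal mean. The window rule is an assumed stand-in for the optimal policy π*_n, not its exact form.

```python
import math
import random

def greedy_window_policy(n, c=math.sqrt(2)):
    """One run of a simple acceptance-window heuristic for online selection of
    an increasing subsequence from n i.i.d. Uniform(0,1) values.

    Illustrative stand-in for the optimal policy pi*_n, not its exact form:
    when the last selected value is v and m observations remain, accept an
    arrival x > v only if its relative position (x - v) / (1 - v) is below
    c / sqrt(m * (1 - v)) -- a window that, if held fixed, would admit about
    sqrt(2 * m * (1 - v)) of the remaining arrivals.
    """
    v, count = 0.0, 0
    for t in range(n):
        x = random.random()
        m = n - t
        if x <= v:
            continue
        window = c / math.sqrt(max(m * (1.0 - v), 1e-12))
        if (x - v) / (1.0 - v) <= window:
            v, count = x, count + 1
    return count

if __name__ == "__main__":
    n, trials = 10_000, 300
    runs = [greedy_window_policy(n) for _ in range(trials)]
    mean = sum(runs) / trials
    var = sum((r - mean) ** 2 for r in runs) / (trials - 1)
    print(f"heuristic mean ~ {mean:.1f}, sample variance ~ {var:.1f}")
    print(f"sqrt(2n) = {math.sqrt(2 * n):.1f}  (classical first-order rate for the optimal mean)")
```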
We study the hiring and retention of heterogeneous workers who learn over time. We show that the problem can be analyzed as an infinite-armed bandit with switching costs, and we apply results from Bergemann and Välimäki [Bergemann D, Välimäki J (2001) Stationary multi-choice bandit problems. J. Econom. Dynam. Control 25(10):1585–1594] to characterize the optimal hiring and retention policy. For problems with Gaussian data, we develop approximations that allow the efficient implementation of the optimal policy and the evaluation of its performance. Our numerical examples demonstrate that the value of active monitoring and screening of employees can be substantial. This paper was accepted by Yossi Aviv, operations management.
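The following sketch conveys the flavor of the monitoring-and-screening problem: a worker of unknown quality produces noisy output, the employer updates a Gaussian posterior each period, and a simple optimistic index decides whether to retain or replace the worker at a switching cost. The index rule and all parameters are hypothetical; the paper's optimal policy comes from Bergemann and Välimäki's bandit characterization rather than this heuristic.

```python
import random

def simulate_hiring(periods=200, mu0=0.0, sigma0=1.0, noise=1.0,
                    switch_cost=0.5, bonus=0.3):
    """Illustrative sketch of hiring/retention with Bayesian learning about
    worker quality (hypothetical parameters; not the paper's optimal policy).

    Each worker's quality theta ~ N(mu0, sigma0^2); observed output is
    theta + N(0, noise^2).  After each observation the employer updates the
    posterior and replaces the worker when a simple index
    (posterior mean + bonus * posterior std) drops below the prior mean of a
    fresh hire net of the switching cost.
    """
    def new_worker():
        theta = random.gauss(mu0, sigma0)
        return theta, mu0, sigma0 ** 2              # true quality, posterior mean/var

    theta, m, v = new_worker()
    total = 0.0
    for _ in range(periods):
        y = theta + random.gauss(0.0, noise)        # observed output this period
        total += y
        # conjugate normal update of the posterior on theta
        v_new = 1.0 / (1.0 / v + 1.0 / noise ** 2)
        m = v_new * (m / v + y / noise ** 2)
        v = v_new
        index = m + bonus * v ** 0.5                # optimistic retention index
        if index < mu0 - switch_cost:               # screen out low performers
            total -= switch_cost
            theta, m, v = new_worker()
    return total

if __name__ == "__main__":
    random.seed(0)
    print("average net output per period:",
          sum(simulate_hiring() for _ in range(200)) / (200 * 200))
```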
We study the behavior of strategic customers in an open-routing service network with multiple stations. When a customer enters the network, she is free to choose the sequence of stations that she visits, with the objective of minimizing her expected total system time. We propose a two-station game with all customers present at the start of service and deterministic service times, and we find that strategic customers "herd," that is, in equilibrium all customers choose the same route. For unobservable systems, we prove that the game is supermodular, and we then identify a broad class of learning rules, including both fictitious play and Cournot best response, that converges to herding in finite time. By combining different theoretical and numerical analyses, we find that herding behavior is prevalent in many other congested open-routing service networks, including those with arrivals over time, those with stochastic service times, and those with more than two stations. We also find that the system under herding performs very close to the first-best outcome in terms of cumulative system time.
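A stylized sketch of the learning-dynamics claim: if a route becomes cheaper as more of the other customers take it (the externality that drives herding), fictitious-play best responses lock every customer onto one route within a few rounds. The linear cost function below is an illustrative assumption, not the paper's two-station queueing model.

```python
import random

def fictitious_play_herding(n_customers=20, rounds=15, seed=1):
    """Stylized fictitious-play dynamics in a two-route game (hypothetical
    linear cost, not the paper's two-station queueing model).

    Assumption: a customer's expected system time on a route falls as more of
    the other customers take that same route.  Each round, every customer
    best-responds to the empirical frequency with which the others chose
    route A in all past rounds.
    """
    random.seed(seed)
    routes = [random.choice("AB") for _ in range(n_customers)]   # round-0 play
    count_a = [1 if r == "A" else 0 for r in routes]             # cumulative A-choices
    for t in range(1, rounds + 1):
        new_routes = []
        for i in range(n_customers):
            # empirical fraction of *other* customers on route A so far
            others_a = (sum(count_a) - count_a[i]) / (t * (n_customers - 1))
            cost_a = 1.0 - 0.5 * others_a            # cheaper to follow the crowd
            cost_b = 1.0 - 0.5 * (1.0 - others_a)
            new_routes.append("A" if cost_a <= cost_b else "B")
        routes = new_routes
        count_a = [c + (1 if r == "A" else 0) for c, r in zip(count_a, routes)]
        frac = routes.count("A") / n_customers
        print(f"round {t}: fraction on route A = {frac:.2f}")
        if frac in (0.0, 1.0):
            print("herding reached")
            break

if __name__ == "__main__":
    fictitious_play_herding()
```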
Abstract. Given a sequence of independent random variables with a common continuous distribution, we consider the online decision problem where one seeks to minimize the expected value of the time that is needed to complete the selection of a monotone increasing subsequence of a prespecified length n. This problem is dual to some online decision problems that have been considered earlier, and this dual problem has some notable advantages. In particular, the recursions and equations of optimality lead with relative ease to asymptotic formulas for the mean and variance of the minimal selection time.
Mathematics Subject Classification (2010): Primary: 60C05, 90C40; Secondary: 60G40, 90C27, 90C39
Key Words: Increasing subsequence problem, online selection, sequential selection, time-focused decision problem, dynamic programming, Markov decision problem.
Increasing Subsequences and Time-Focused Selection
If X_1, X_2, . . . is a sequence of independent random variables with a common continuous distribution F, then . . .
Here we consider a new kind of decision problem where one seeks to select as quickly as possible an increasing subsequence of a prespecified length n. More precisely, at time i, when the decision maker is first presented with X_i, a decision
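To illustrate the time-focused problem described in this abstract, the sketch below runs a simple acceptance-window heuristic that builds an increasing subsequence of a prespecified length k from Uniform(0,1) arrivals and records how many observations it inspects. The window rule is an assumed stand-in for the optimal policy, and the value k²/2 printed for comparison is only the leading order suggested by inverting the classical √(2m) growth rate of the dual problem, not a figure quoted from the paper.

```python
import random

def time_to_select(k, c=1.0):
    """Time a simple acceptance-window heuristic needs to build an increasing
    subsequence of length k from i.i.d. Uniform(0,1) arrivals.

    Illustrative stand-in for the optimal time-focused policy, not its exact
    form: with r selections still required and last selected value v, accept
    the next arrival x > v whose relative increment (x - v) / (1 - v) is at
    most c / r, so the window tightens while many selections remain.
    """
    v, remaining, t = 0.0, k, 0
    while remaining > 0:
        t += 1
        x = random.random()
        if x > v and (x - v) / (1.0 - v) <= c / remaining:
            v, remaining = x, remaining - 1
    return t

if __name__ == "__main__":
    k, trials = 100, 300
    avg = sum(time_to_select(k) for _ in range(trials)) / trials
    print(f"heuristic mean selection time ~ {avg:.0f}")
    print(f"k^2 / 2 = {k * k / 2:.0f}  (leading order suggested by the sqrt(2m) duality)")
```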