We study a general Markov game with metric switching costs: in each round, the player adaptively chooses one of several Markov chains to advance, with the objective of minimizing the expected cost for at least 𝑘 chains to reach their target states. If the player decides to play a different chain, an additional switching cost is incurred. The special case in which there is no switching cost was solved optimally by Dumitriu, Tetali and Winkler [DTW03] by a variant of the celebrated Gittins index for the classical multi-armed bandit (MAB) problem with Markovian rewards [Git74, Git79]. However, for MAB with nontrivial switching costs, even if the switching cost is a constant, the classic paper by Banks and Sundaram [BS94] showed that no index strategy can be optimal. In this paper, we complement their result and show that there is a simple index strategy that achieves a constant approximation factor if the switching cost is constant and 𝑘 = 1. To the best of our knowledge, this is the first index strategy that achieves a constant approximation factor for a general MAB variant with switching costs. For the general metric, we propose a more involved constant-factor approximation algorithm, via a nontrivial reduction to the stochastic 𝑘-TSP problem, in which a Markov chain is approximated by a random variable. Our analysis makes extensive use of various interesting properties of the Gittins index.