The demand for seamless Internet access under extreme user mobility, such as on high-speed trains and in vehicles, has become the norm rather than the exception. However, 4G/5G mobile networks cannot always meet this demand reliably, suffering non-negligible failures during handover between base stations. A fundamental challenge for reliability is balancing exploration (collecting more measurements to find a satisfactory handover target) against exploitation (handing over in time, before the fast-moving user leaves the serving base station's radio coverage). This paper formulates this trade-off in extreme mobility as a composition of two distinct multi-armed bandit problems. We propose Bandit and Threshold Tuning (BaTT) to minimize the regret of handover failures in extreme mobility. BaTT uses ε-binary-search to optimize the serving cell's signal-strength threshold for initiating the handover procedure, with O(log J log T) regret. It further devises opportunistic Thompson sampling, which optimizes the sequence of target cells to measure for reliable handover, with O(log T) regret. Our experiment on a real LTE dataset from Chinese high-speed rails validates significant regret reduction and a 29.1% reduction in handover failures.
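
For intuition about the bandit view of target-cell selection, the following is a minimal sketch of standard Beta-Bernoulli Thompson sampling. It is not the opportunistic variant proposed in this paper. It assumes each candidate target cell has a fixed, unknown handover success probability, and the names `BernoulliThompsonSampler`, `select_cell`, `update`, and `simulate` are hypothetical, introduced only for illustration.

```python
import random


class BernoulliThompsonSampler:
    """Standard Beta-Bernoulli Thompson sampling over candidate target cells.

    Illustrative assumption: each cell's handover outcome is an independent
    Bernoulli draw with a fixed, unknown success probability.
    """

    def __init__(self, num_cells, seed=None):
        self.successes = [0] * num_cells
        self.failures = [0] * num_cells
        self.rng = random.Random(seed)

    def select_cell(self):
        # Draw one sample per cell from its Beta(successes + 1, failures + 1)
        # posterior and pick the cell with the largest sample.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, cell, handover_succeeded):
        # Record the observed outcome for the chosen cell.
        if handover_succeeded:
            self.successes[cell] += 1
        else:
            self.failures[cell] += 1


def simulate(success_probs, rounds, seed=0):
    """Run the sampler against synthetic cells; return per-cell selection counts."""
    env_rng = random.Random(seed)
    sampler = BernoulliThompsonSampler(len(success_probs), seed=seed + 1)
    pulls = [0] * len(success_probs)
    for _ in range(rounds):
        cell = sampler.select_cell()
        outcome = env_rng.random() < success_probs[cell]
        sampler.update(cell, outcome)
        pulls[cell] += 1
    return pulls
```

Calling `simulate([0.2, 0.5, 0.9], 2000)` with these synthetic probabilities concentrates most selections on the most reliable cell as evidence accumulates, which is the behavior behind logarithmic-regret guarantees for Thompson sampling.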