Mobile-edge computing (MEC) offloads computational tasks from wireless devices to the network edge, enabling real-time information transmission and computing. Most existing work considers small-scale synchronous MEC systems. In this paper, we focus on a large-scale asynchronous MEC system with random task arrivals, distinct workloads, and diverse deadlines. We formulate the offloading policy design as a restless multi-armed bandit (RMAB) problem to maximize the total discounted reward over the time horizon. However, the formulated RMAB corresponds to a PSPACE-hard sequential decision-making problem, which is intractable. To address this issue, we exploit Whittle index (WI) theory, rigorously establish WI indexability, and derive a scalable closed-form solution. Consequently, under our WI policy, each user only needs to compute its WI and report it to the base station (BS), and the users with the highest indices are selected for task offloading. Furthermore, when the task completion ratio becomes the focus, the shorter slack time, less remaining workload (STLW) priority rule is incorporated into the WI policy to improve performance. When users' offloading energy consumption is unknown prior to offloading, we develop Bayesian-learning-enabled WI policies based on maximum likelihood estimation, Bayesian learning with conjugate priors, and prior-swapping techniques. Simulation results show that the proposed policies significantly outperform existing policies.
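The scheduling step of the WI policy admits a very simple sketch: each user reports a scalar index, and the BS serves the users with the largest indices. The snippet below is a minimal illustration under assumed names (`select_offloading_users`, `num_channels`, and the example index values are all hypothetical); the paper's closed-form index computation itself is not reproduced here.

```python
import heapq


def select_offloading_users(indices, num_channels):
    """Select the users with the largest Whittle indices.

    indices: dict mapping user id -> reported Whittle index
    num_channels: number of users the BS can serve per slot
    (Illustrative sketch only; the closed-form WI computation
    at each user is outside the scope of this snippet.)
    """
    # heapq.nlargest over the dict keys, ranked by index value
    return heapq.nlargest(num_channels, indices, key=indices.get)


# Example: three users report indices; the BS serves two per slot.
reported = {"u1": 0.8, "u2": 0.3, "u3": 0.5}
print(select_offloading_users(reported, 2))  # → ['u1', 'u3']
```

The key scalability property is that each user computes its own index locally, so the BS only performs a top-k selection rather than solving a joint decision problem over all users.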