With the rapid advancement of Intelligent Transportation Systems (ITS) and vehicular communications, Vehicular Edge Computing (VEC) is emerging as a promising technology to support low-latency ITS applications and services. In this paper, we consider the computation offloading problem for mobile vehicles/users in a heterogeneous VEC scenario and focus on the network and base station selection problems, where different networks carry different traffic loads. In a fast-varying vehicular environment, the offloading experience of users is strongly affected by the latency caused by congestion at the edge computing servers co-located with the base stations. However, owing to the non-stationary nature of such an environment and the limited information available to the users, predicting this congestion is a challenging task. To address this challenge, we propose an online learning algorithm and an off-policy learning algorithm based on multi-armed bandit theory. To dynamically select the least congested network in a piece-wise stationary environment, these algorithms predict the latency that offloaded tasks will experience from the offloading history. In addition, to minimize the task loss caused by the mobility of the vehicles, we develop a method for base station selection. We also propose a relaying mechanism for the selected network that operates based on the sojourn time of the vehicles. Through extensive numerical analysis, we demonstrate that the proposed learning-based solutions adapt to changes in network traffic by selecting the least congested network, thereby reducing the latency of offloaded tasks. Furthermore, we show that the proposed joint base station selection and relaying mechanism minimizes task loss in a vehicular environment.
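To make the bandit-based network selection concrete, the following is a minimal illustrative sketch of a sliding-window UCB policy that picks the least congested network from recently observed offloading latencies. The window size, the latency-to-reward mapping, and the simulated congestion shift are assumptions chosen for illustration; they are not the paper's exact algorithm or parameters.

```python
# Illustrative sketch (not the paper's algorithm): sliding-window UCB for
# selecting the least-congested network in a piece-wise stationary environment.
import math
import random
from collections import deque


class SlidingWindowUCB:
    """Treats each candidate network as an arm and keeps only the most recent
    observations, so the policy can track changes in network traffic load."""

    def __init__(self, n_arms: int, window: int = 200, c: float = 1.0):
        self.n_arms = n_arms
        self.window = window                  # number of recent rounds kept
        self.c = c                            # exploration weight
        self.history = deque(maxlen=window)   # (arm, reward) pairs
        self.t = 0

    def select(self) -> int:
        self.t += 1
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, reward in self.history:
            counts[arm] += 1
            sums[arm] += reward
        # Play each arm at least once before applying the UCB index.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        horizon = min(self.t, self.window)
        ucb = [
            sums[a] / counts[a] + self.c * math.sqrt(math.log(horizon) / counts[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=lambda a: ucb[a])

    def update(self, arm: int, latency: float, latency_max: float = 1.0) -> None:
        # Lower observed latency maps to higher reward, clipped to [0, 1]
        # (an assumed reward shaping for this sketch).
        reward = min(1.0, max(0.0, 1.0 - latency / latency_max))
        self.history.append((arm, reward))


if __name__ == "__main__":
    # Toy simulation: mean congestion latency per network shifts halfway through,
    # mimicking a piece-wise stationary traffic pattern.
    random.seed(0)
    means = [0.3, 0.6, 0.5]
    bandit = SlidingWindowUCB(n_arms=3, window=150)
    for t in range(1000):
        if t == 500:
            means = [0.7, 0.2, 0.5]  # traffic loads change
        arm = bandit.select()
        latency = min(1.0, max(0.0, random.gauss(means[arm], 0.05)))
        bandit.update(arm, latency)
    print("network preferred after the change:", bandit.select())
```

The sliding window is what lets the policy discard stale latency observations after a traffic shift, which is the intuition behind adapting to a piece-wise stationary environment rather than averaging over the entire offloading history.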