As autonomous and connected vehicles become a reality, mobile-edge computing (MEC) offloading provides a promising paradigm for trading off the long latency of cloud computing against the high cost of upgrading the on-board computers of vehicles. However, because task arrivals are random, vehicles tend to choose an MEC server for offloading selfishly, which does not serve the social good of the whole system and may even cause some tasks to fail when MEC servers overflow. This paper elaborates on the modeling of the task arrival process and on the influence of various offloading modes on computation cost. Interestingly, by formulating task arrivals as a compound process of vehicle arrivals and task generations, we find that the task arrival model for MEC servers does not follow the standard Poisson distribution, which contradicts the popular assumption in most existing studies. Considering the load distribution and the prediction of cost, we propose a load-aware MEC offloading method in which each vehicle selects an MEC server based on the predicted cost, using updated knowledge of the load distribution of MEC servers. Analysis and simulation show that the proposed scheme can achieve up to a 65% reduction in total cost with an almost 100% task success ratio.
INDEX TERMS Mobile-edge computing, task arrival model, load-aware offloading, vehicular networks, load balance.
I. INTRODUCTION
With the development of the Internet of Vehicles (IoV), autonomous and connected vehicles are envisioned to provide a safer, greener [1], [2] and much more convenient [3], [4] transportation system for the public. The Internet of Vehicles is based on a core technology called Vehicular Ad Hoc Networks (VANETs) [5], [6], which brings the capabilities of wireless networks to vehicles.
Vehicles may communicate with other vehicles directly in a vehicle-to-vehicle (V2V) fashion [7], or communicate through roadside units (RSUs), in so-called Vehicle-to-Infrastructure (V2I) communications [8]. The IEEE 802 committee has defined the wireless communication standard IEEE 802.11p [9] for V2I communication, and the Federal Communications Commission has allocated 75 MHz of bandwidth in the 5.9 GHz band for short-range communications [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Xu Chen.
Because the on-board computers of vehicles are resource-limited, many potential applications, such as assisted accident avoidance [11]-[13], mobile crowd sensing [14] and augmented reality [15], pose a grand challenge to vehicular terminals: they require significant computing power to process data generated by the vehicle sensors in real time [16]. To relieve vehicles of this computation burden, cloud computing [17], [18] has been adopted: computation tasks are gathered at central cloud servers, and the results are sent back to local users. However, the long distance between the cloud center and vehicles [19] results in long l...