IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, collecting new collected works for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract: The provision of reliable connectivity is envisioned as a key enabler for future autonomous driving. Anticipatory communication techniques have been proposed for proactively considering the properties of the highly dynamic radio channel within the communication systems themselves. Since real-world experiments are highly time-consuming and lack a controllable environment, performance evaluations and parameter studies for novel anticipatory vehicular communication systems are typically carried out based on network simulations. However, due to the required simplifications and the wide range of unknown parameters (e.g., Mobile Network Operator (MNO)-specific configurations of the network infrastructure), the achieved results often differ significantly from the behavior observed in real-world evaluations. In this paper, we present Data-driven Network Simulation (DDNS) as a novel data-driven approach for analyzing and optimizing anticipatory vehicular communication systems. Different machine learning models are combined to achieve a close-to-reality representation of the analyzed system's behavior. In a proof-of-concept evaluation focusing on opportunistic vehicular data transfer, the proposed method is validated against field measurements and system-level network simulation.
In contrast to the latter, DDNS not only generates results massively faster, but also achieves a significantly better representation of the real-world behavior, because the machine learning approach implicitly accounts for cross-layer dependencies.

The authors are with Communication Networks Institute, TU Dortmund University, ...

... be brought to the next performance level. The exploitation of machine learning offers the potential to be the catalyst for this development [5], as its inherent strength is to leverage hidden interdependencies between measurable variables, which are mostly too complex to be covered by an analytical solution.

The development process of these novel anticipatory vehicular communication systems confronts researchers and engineers with a methodological dilemma: while the most accurate estimates of future real-world performance can be obtained through real-world experiments, this approach is highly time-consuming and lacks a controllable environment. In fact, it is practically impossible to guarantee fairness by evaluating different methods under the exact same network conditions. System-level network simulation based on Discrete Event Simulation (DES) has emerged as the most commonly used scientific method to analyze mobile communication systems [6], because it addresses both issues. However, the necessary model simplifications reduce the significance of...