Learning any-to-any (A2A) path loss maps, where the objective is to reconstruct the path loss between any two points in a map, might be a key enabler for many applications that rely on device-to-device (D2D) communication. Such applications include machine-type communications (MTC) and vehicle-to-vehicle (V2V) communications. Current approaches for learning A2A maps are either model-based or purely data-driven. Model-based methods have the advantage that they can generate reliable estimates with low computational complexity, but they cannot exploit information coming from data. Purely data-driven methods can achieve good performance without assuming any physical model, but their complexity and lack of robustness are not acceptable for many applications. In this paper, we propose a novel hybrid model-based and data-driven approach that fuses information obtained from datasets and models in an online fashion. To that end, we leverage the framework of stochastic learning to deal with the sequential arrival of samples, and we propose an online algorithm that alternately and sequentially minimizes the objective of the original nonconvex problem. We present a proof of convergence, along with experiments first on synthetic data and then on a more realistic vehicle-to-everything (V2X) dataset; both experiments show promising results.