“…Training is supervised, using a relative L1 loss and a physics-informed loss based on a discretized transient heat equation, inspired by [1]. The CNN architecture uses parallel branches with different dilation rates to aggregate far-field features [2].…”
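
To make the two ingredients named in the excerpt concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that the excerpt does not state: the relative L1 loss is taken as the summed absolute error divided by the summed absolute target; the transient heat equation is taken as ∂T/∂t = α∇²T with constant diffusivity and no source term, discretized with an explicit Euler step and a five-point Laplacian on a uniform 2D grid; and the parallel branches are single-channel 3×3 convolutions with dilations 1, 2 and 4. All function names, parameters and the loss weight are hypothetical.

```python
import numpy as np


def relative_l1_loss(pred, target, eps=1e-8):
    """Assumed definition: sum|pred - target| / (sum|target| + eps)."""
    return np.abs(pred - target).sum() / (np.abs(target).sum() + eps)


def laplacian_2d(T, dx):
    """Five-point finite-difference Laplacian, evaluated on interior nodes."""
    return (
        T[:-2, 1:-1] + T[2:, 1:-1] + T[1:-1, :-2] + T[1:-1, 2:]
        - 4.0 * T[1:-1, 1:-1]
    ) / dx**2


def heat_residual_loss(T_next, T_curr, alpha, dt, dx):
    """Mean absolute residual of (T_next - T_curr)/dt - alpha * lap(T_curr).

    Assumes constant diffusivity alpha, no heat source, explicit Euler in time.
    """
    dTdt = (T_next[1:-1, 1:-1] - T_curr[1:-1, 1:-1]) / dt
    residual = dTdt - alpha * laplacian_2d(T_curr, dx)
    return np.abs(residual).mean()


def total_loss(pred_next, target_next, T_curr, alpha, dt, dx, weight=1.0):
    """Data term plus weighted physics term; `weight` is a hypothetical knob."""
    data = relative_l1_loss(pred_next, target_next)
    physics = heat_residual_loss(pred_next, T_curr, alpha, dt, dx)
    return data + weight * physics


def dilated_conv3x3(x, kernel, dilation):
    """Zero-padded, same-size 3x3 cross-correlation with the given dilation."""
    d = dilation
    H, W = x.shape
    xp = np.pad(x, d)  # pad d zeros on every side
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            # Kernel tap (i, j) reads the input shifted by ((i-1)*d, (j-1)*d).
            out += kernel[i, j] * xp[i * d : i * d + H, j * d : j * d + W]
    return out


def parallel_dilated_block(x, kernels, dilations=(1, 2, 4)):
    """Run one 3x3 branch per dilation and stack the results as channels."""
    branches = [dilated_conv3x3(x, k, d) for k, d in zip(kernels, dilations)]
    return np.stack(branches, axis=0)
```

A 3×3 kernel with dilation d spans a (2d+1)×(2d+1) window, so in this sketch the dilation-4 branch reads values four cells away from each output location while the dilation-1 branch stays local. Stacking the branches gives later layers both near-field and far-field information at the same resolution.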