“…First, very large training datasets are tipically required: e.g., hundreds of thousands of downlink training April 12, 2022 DRAFT samples are used in many of the aforementioned studies [9], [11], [16], [17], [19], [23], [26], [27], [31], [33], [35]. This problem has been recently addressed by training the DL architectures solely with uplink channel samples, assumed available at the BS [30], [18], [32], [36]. These works exploit the conjecture that learning at the uplink frequency can be transferred at the downlink frequency with no further modification, as proposed for the first time in [30] and validated in [36].…”