Can we map the channels at one set of antennas and one frequency band to the channels at another set of antennas, possibly at a different location and a different frequency band? If this channel-to-channel mapping is possible, we can expect dramatic gains for massive MIMO systems. For example, in FDD massive MIMO, the uplink channels can be mapped to the downlink channels, or the downlink channels at one subset of antennas can be mapped to the downlink channels at all the other antennas. This can significantly reduce (or even eliminate) the downlink training/feedback overhead. In the context of cell-free/distributed massive MIMO systems, this channel mapping can be leveraged to reduce the fronthaul signaling overhead, as only the channels at a subset of the distributed terminals need to be fed to the central unit, which can map them to the channels at all the other terminals. This mapping can also find interesting applications in mmWave beam prediction, MIMO radar, and massive MIMO based positioning.
In this paper, we introduce the new concept of channel mapping in space and frequency, where the channels at one set of antennas and one frequency band are mapped to the channels at another set of antennas and frequency band. First, we prove that this channel-to-channel mapping function exists under the condition that the mapping from the candidate user positions to the channels at the first set of antennas is bijective, a condition that can be achieved with high probability in several practical MIMO communication scenarios. Then, we note that the channel-to-channel mapping function, even if it exists, is typically unknown and very hard to characterize analytically, as it depends heavily on the various elements of the surrounding environment. With this motivation, we propose to leverage the powerful learning capabilities of deep neural networks to learn (approximate) this complex channel mapping function.
For a case study of a distributed/cell-free massive MIMO system with 64 antennas, the results show that the channels acquired at only 4-8 antennas can be efficiently mapped to the channels at all 64 distributed antennas, even if the 64 antennas operate at a different frequency band. Further, 3D ray-tracing based simulations show that the rates achieved with the predicted channels are near-optimal when compared to the upper bound with perfect channel knowledge. This highlights a novel solution for reducing the training and feedback overhead in mmWave and massive MIMO systems thanks to the powerful learning capabilities of deep neural networks.
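As a brief sketch of why the bijectivity condition suffices for the existence claim, the argument can be written as a composition of position-to-channel maps. The symbols below ($\mathbf{x}$ for the user position, $\mathcal{X}$ for the set of candidate positions, $g_1$ and $g_2$ for the position-to-channel maps, $\mathbf{h}_1$ and $\mathbf{h}_2$ for the channels at the two antenna sets and bands, and $\Phi$ for the channel-to-channel mapping) are notation assumed here for illustration and are not taken from the text above.

\[
\begin{aligned}
g_1 &: \mathcal{X} \to \mathcal{H}_1, \qquad \mathbf{h}_1 = g_1(\mathbf{x}), \\
g_2 &: \mathcal{X} \to \mathcal{H}_2, \qquad \mathbf{h}_2 = g_2(\mathbf{x}), \\
g_1 \text{ bijective} \;&\Longrightarrow\; g_1^{-1} : \mathcal{H}_1 \to \mathcal{X} \text{ exists}, \\
\Phi &\triangleq g_2 \circ g_1^{-1}, \qquad \mathbf{h}_2 = \Phi(\mathbf{h}_1) = g_2\!\left(g_1^{-1}(\mathbf{h}_1)\right).
\end{aligned}
\]

Under these assumptions, $\mathcal{H}_1$ denotes the set of channels that $g_1$ produces over the candidate positions, so each observed $\mathbf{h}_1$ identifies a unique position and hence a unique $\mathbf{h}_2$. The neural network is then trained to approximate $\Phi$ directly from channel pairs, without estimating $\mathbf{x}$ explicitly.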