The lack of adequate stereo coverage and, where stereo coverage is available, the lengthy processing time, various artefacts, unsatisfactory quality, and the complexity of automating the selection of the best set of processing parameters have long been major barriers to large-area planetary 3D mapping. In this paper, we propose a deep learning-based solution, called MADNet (Multi-scale generative Adversarial u-net with Dense convolutional and up-projection blocks), that avoids or resolves all of the above issues. We demonstrate the wide applicability of this technique with the ExoMars Trace Gas Orbiter Colour and Stereo Surface Imaging System (CaSSIS) 4.6 m/pixel images of Mars. Only a single input image and a coarse global 3D reference are required, with no knowledge of camera models or imaging parameters, to produce high-quality, high-resolution full-strip Digital Terrain Models (DTMs) in a few seconds. We discuss the technical details of the MADNet system and provide detailed comparisons and assessments of the results. The resultant MADNet 8 m/pixel CaSSIS DTMs are qualitatively very similar to the 1 m/pixel High-Resolution Imaging Science Experiment (HiRISE) DTMs. They display excellent agreement at large scale with nested Mars Reconnaissance Orbiter Context Camera (CTX), Mars Express’s High-Resolution Stereo Camera (HRSC), and Mars Orbiter Laser Altimeter (MOLA) DTMs, and show fairly good correlation with the HiRISE DTMs for fine-scale details. In addition, we show that MADNet outperforms traditional photogrammetric methods in both speed and quality on other datasets, such as HRSC, CTX, and HiRISE, without any parameter tuning or re-training of the model. We demonstrate the results for Oxia Planum (the landing site of the European Space Agency’s Rosalind Franklin ExoMars rover 2023) and a couple of sites of high scientific interest.