Deep learning (DL) is now applied in a large number of scientific fields, including the estimation and enhancement of signals corrupted by noise of different kinds. In this article, we address the problem of estimating the interferometric parameters from synthetic aperture radar (SAR) data. In particular, we combine convolutional neural networks with the concept of residual learning to define a novel architecture, named Φ-Net, for the joint estimation of the interferometric phase and coherence. Φ-Net is trained using synthetic data obtained by an innovative strategy based on the theoretical modeling of the physics behind the SAR acquisition principle. This strategy allows the network to generalize the estimation problem with respect to: 1) different noise levels; 2) the nature of the imaged target on the ground; and 3) the acquisition geometry. We then analyze the performance of Φ-Net on an independent data set of synthesized interferometric data, as well as on real InSAR data from the TanDEM-X and Sentinel-1 missions. The proposed architecture provides better results than state-of-the-art InSAR algorithms on both synthetic and real test data. Finally, we perform an application-oriented study on the retrieval of topographic information, which shows that Φ-Net is a strong candidate for the generation of high-quality digital elevation models at a resolution close to that of the original single-look complex data.