“…Eigen et al [16,17] proposed a multi-stage, coarseto-fine network to estimate depth from a single image, and Laina et al [35] a fully convolutional architecture for depth estimation. Some works introduce attention mechanisms to achieve significant performance improvements [1,2,37,39]. Other works estimate depth by predicting probability distributions and discrete bins [5,40,59].…”