Vision-based depth estimation plays a significant role in Intelligent Transportation Systems (ITS) because of its low cost and high efficiency; it can be used to analyze the driving environment and improve driving safety, among other applications. Although recently proposed approaches abandon time-consuming pre-processing and post-processing steps and predict depth end to end, fine details may be lost in max-pooling-based encoder modules. To tackle this problem, we propose the Multi-Scale Dilated Convolution Network (MSDC-Net), a deep network based on dilated convolution. In the feature encoding and decoding parts, dilated layers maintain the scale of the original image and reduce the loss of detail. A pyramid dilated feature extraction module is then added to integrate the knowledge learned through the forward steps with different receptive fields. The proposed approach is evaluated on the KITTI dataset, where it achieves state-of-the-art results.

INDEX TERMS Depth estimation, ResNet, dilated network, multi-scale dilated module, intelligent transportation systems (ITS).
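
The resolution argument above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `conv_out_size` and `receptive_field`, the 64-pixel input, the 3x3 kernels, and the dilation rates (1, 2, 4) are all assumptions chosen for illustration. It uses the standard output-size formula for convolution and pooling layers to show that a dilated layer padded by its dilation rate keeps the spatial size, that a stride-2 max-pooling layer halves it, and that stacking dilated layers still enlarges the receptive field.

```python
def conv_out_size(size, kernel=3, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis of a convolution or pooling layer."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of (kernel, stride, dilation) layers."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += dilation * (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical settings, not taken from the paper.
input_size = 64
rates = (1, 2, 4)

# Three 3x3 dilated layers, each padded by its own dilation rate.
dilated_size = input_size
for rate in rates:
    dilated_size = conv_out_size(dilated_size, kernel=3, padding=rate, dilation=rate)

# One 2x2 max-pooling layer with stride 2.
pooled_size = conv_out_size(input_size, kernel=2, stride=2)

# Receptive field of the dilated stack versus three ordinary 3x3 layers.
dilated_rf = receptive_field([(3, 1, rate) for rate in rates])
plain_rf = receptive_field([(3, 1, 1)] * len(rates))

print(dilated_size, pooled_size, dilated_rf, plain_rf)
```

Under these assumptions the dilated stack preserves the 64-pixel axis while covering a 15-pixel receptive field, whereas three ordinary 3x3 layers cover 7 pixels and a single pooling step already reduces the axis to 32 pixels.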