Very high resolution (VHR) images provide abundant ground details and spatial distribution information. Change detection in multi-temporal VHR images plays a significant role in urban expansion and area internal change analysis. Nevertheless, traditional change detection methods can neither take full advantage of spatial context information nor cope with the complex internal heterogeneity of VHR images. In this paper, we propose a powerful multi-scale feature convolution unit (MFCU) for change detection in VHR images. The proposed unit is able to extract multi-scale features in the same layer. Based on the proposed unit, two novel deep Siamese convolutional networks, deep Siamese multi-scale convolutional network (DSMS-CN) and deep Siamese multi-scale fully convolutional network (DSMS-FCN), are designed for unsupervised and supervised change detection in multi-temporal VHR images. For unsupervised change detection, we implement automatic pre-classification to obtain training patch samples, and the DSMS-CN fits the statistical distribution of changed and unchanged area from patch samples through multi-scale feature extraction module and deep Siamese architecture. For supervised change detection, the end-to-end deep fully convolutional network DSMS-FCN is trained in any size of multi-temporal VHR images, and directly output the binary change map. In addition, for the purpose of solving the inaccurate localization problem, the fully connected conditional random field (FC-CRF) is combined with DSMS-FCN to refine the results. The experimental results with challenging data sets confirm that the two proposed architectures perform better than the state-of-the-art methods.