Abstract. Large-scale Digital Surface Models (DSMs) generated from high-resolution satellite images (HRSI) are comparable to, yet cheaper and more accessible than, those derived from Light Detection and Ranging (LiDAR) data and aerial remotely sensed images. Several commercial and open-source photogrammetric software packages are being developed for satellite image-based 3D reconstruction, and most of them adopt a modified version of the Semi-Global Matching (SGM) algorithm for dense image matching (DIM). As matching cost computation methods have continued to develop, existing methods can be divided into classical (low-level) algorithms and learning-based algorithms, the latter comprising non-end-to-end and end-to-end learning methods. On the Middlebury and KITTI datasets, learning-based algorithms have shown their superiority over SGM-derived methods. In this context, we assume that the matching cost is the key factor of DIM. This paper reviews and evaluates the Census Transform and MC-CNN on WorldView-3 satellite stereo images of a typical city scene, on the premise that the overall SGM framework remains unchanged, providing a preliminary comparison for academia and industry. We first compute the cost volumes of these two methods, obtain the final DSMs after semi-global optimization, and compare their geometric accuracy with the corresponding LiDAR-derived ground truth. We present our comparison and findings in the experimental section.
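To make the notion of a matching cost volume concrete, the following is a minimal sketch of a Census Transform cost volume for a rectified stereo pair, written with NumPy. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names (`census_transform`, `popcount`, `census_cost_volume`), the 5x5 window (`radius=2`), the edge padding, and the convention that a left pixel at column x matches a right pixel at column x - d are all choices made here for the example. The semi-global optimization step that would follow is not shown.

```python
import numpy as np


def census_transform(img, radius=2):
    """Encode each pixel as a bit string: 1 where a neighbour is darker than the centre.

    Assumes a 2-D grayscale array; radius=2 gives a 5x5 window (24 bits).
    """
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")  # replicate borders (an assumption)
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[radius + dy: radius + dy + h,
                               radius + dx: radius + dx + w]
            bit = (neighbour < img).astype(np.uint64)
            code = (code << np.uint64(1)) | bit
    return code


def popcount(x):
    """Count the set bits of every element in a uint64 array."""
    as_bytes = np.ascontiguousarray(x).view(np.uint8).reshape(x.shape + (8,))
    return np.unpackbits(as_bytes, axis=-1).sum(axis=-1)


def census_cost_volume(left, right, max_disp, radius=2):
    """Hamming distance between census codes for disparities 0 .. max_disp - 1.

    Returns an array of shape (max_disp, height, width). Pixels with no valid
    match at a given disparity keep the maximum possible cost.
    """
    census_left = census_transform(left, radius)
    census_right = census_transform(right, radius)
    h, w = left.shape
    n_bits = (2 * radius + 1) ** 2 - 1
    cost = np.full((max_disp, h, w), n_bits, dtype=np.uint8)
    for d in range(max_disp):
        # Left pixel at column x is compared with right pixel at column x - d.
        diff = census_left[:, d:] ^ census_right[:, : w - d]
        cost[d, :, d:] = popcount(diff).astype(np.uint8)
    return cost
```

In an SGM-style pipeline, a volume like this would be aggregated along several image paths before the disparity with the lowest aggregated cost is selected per pixel; a learned cost such as MC-CNN would fill the same volume with network-predicted similarity scores instead of Hamming distances.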