Multi-sensor remote sensing applications require the registration of optical and Synthetic Aperture Radar (SAR) images, which is challenging because the two distinct imaging mechanisms produce significant radiometric and geometric differences. Although various algorithms have been proposed, including hand-crafted features and deep learning networks, most focus on matching radiometric-invariant features while ignoring geometric differences. Furthermore, these algorithms are often evaluated on datasets whose manually labeled ground truths can be unreliable for high-resolution SAR images affected by speckle noise. To address these issues, we propose a robust global-to-local registration algorithm consisting of four modules: geocoding, global matching, local matching, and refinement. The geocoding module generates a geometry-invariant mask that helps the local matching module focus on valid areas; a fast global matching method resolves large offsets; and the matching confidence of the global stage, reflecting its accuracy, guides the subsequent local matching. We propose a feature based on multi-directional anisotropic Gaussian derivatives (MAGD) and embed it into the confidence-aware local matching together with the geometry-invariant mask to reduce the effect of geometric differences. Finally, we refine correspondence positions and remove outliers. We also build a high-accuracy evaluation dataset of hundreds of image pairs, where the ground truth is derived from metal poles, which appear as clear and reliable structures in both optical and SAR images. Experimental results on this dataset demonstrate the superiority of the proposed algorithm over several state-of-the-art methods.
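As a rough illustration of the kind of feature the abstract names, the sketch below builds a bank of anisotropic Gaussian derivative filters at multiple orientations and stacks their responses per pixel. This is only a minimal interpretation of "multi-directional anisotropic Gaussian derivatives"; the kernel parameterization, scales, and number of orientations are assumptions, not the paper's actual MAGD definition.

```python
import numpy as np
from scipy.ndimage import convolve

def anisotropic_gaussian_derivative_kernel(sigma_x, sigma_y, theta, radius=None):
    """Anisotropic Gaussian rotated by theta, differentiated along its x-axis.

    sigma_x, sigma_y: scales along the rotated axes (anisotropy when unequal).
    """
    if radius is None:
        radius = int(3 * max(sigma_x, sigma_y))
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # Rotate coordinates into the filter's frame.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 / (2 * sigma_x**2) + yr**2 / (2 * sigma_y**2)))
    g /= g.sum()
    # First derivative along the rotated x-axis: d/dx G = -(x / sigma_x^2) * G.
    return -(xr / sigma_x**2) * g

def magd_feature(image, sigma_x=4.0, sigma_y=1.5, n_orientations=8):
    """Per-pixel descriptor: filter responses stacked over orientations.

    Defaults (sigma_x=4.0, sigma_y=1.5, 8 orientations) are illustrative choices.
    """
    responses = [
        convolve(image.astype(float),
                 anisotropic_gaussian_derivative_kernel(sigma_x, sigma_y, t))
        for t in np.linspace(0, np.pi, n_orientations, endpoint=False)
    ]
    return np.stack(responses, axis=-1)  # shape: (H, W, n_orientations)
```

Because elongated anisotropic kernels average along structures while differentiating across them, responses of this kind are comparatively robust to multiplicative speckle, which is presumably why a derivative-based orientation bank suits optical-to-SAR matching.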