Multisource data are captured by different sensors or produced by different generation mechanisms. Ground camera images (images taken by ground-based cameras) and rendered images (synthesized from the position information of a 3D image-based point cloud) are geospatial data from different sources, called cross-domain images. In outdoor environments in particular, the registration between these cross-domain images can be used to establish the spatial relationship between 2D and 3D space, which offers an indirect solution for the virtual-real registration of Augmented Reality (AR). However, traditional handcrafted feature descriptors fail to match such cross-domain images because of the low quality of the rendered images and the domain gap between the two sources. In this paper, inspired by the success of deep learning in computer vision, we first propose an end-to-end network, DIFD-Net, to learn Domain Invariant Feature Descriptors (DIFDs) for cross-domain image patches. The DIFDs are used for cross-domain image patch retrieval, which in turn enables the registration of ground camera and rendered images. Second, we construct a domain-kept consistent loss function to optimize DIFD-Net, which balances the feature descriptors across domains to narrow the gap between them. In particular, negative samples are generated from positive ones during training, and an additional constraint on intermediate feature maps introduces extra supervision for learning the descriptors. Finally, experiments show the superiority of DIFDs for cross-domain image patch retrieval, achieving state-of-the-art performance. Additionally, we use DIFDs to match ground camera images with rendered images and verify the feasibility of the derived AR virtual-real registration in open outdoor environments.

Index Terms: Domain Invariant Feature Descriptor (DIFD), multisource remote sensing data, cross-domain image, image patch matching, augmented reality, virtual-real registration.
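To make the training objective sketched in the abstract more concrete, the snippet below shows one plausible triplet-style formulation in PyTorch, in which negatives are derived from the positive cross-domain pairs within a batch. The function name, the margin value, and the hardest-in-batch mining strategy are illustrative assumptions, not the paper's exact domain-kept consistent loss; the intermediate feature map constraint mentioned above is also omitted here.

```python
import torch
import torch.nn.functional as F

def difd_triplet_loss(desc_cam, desc_ren, margin=1.0):
    """Illustrative triplet-style loss for cross-domain descriptors.

    desc_cam: (B, D) descriptors of ground camera image patches.
    desc_ren: (B, D) descriptors of the corresponding rendered patches.
    Row i of each tensor forms a positive (cross-domain) pair; the
    hardest non-matching rendered patch in the batch serves as the
    negative, i.e. negatives are generated from the positive pairs.
    """
    desc_cam = F.normalize(desc_cam, dim=1)
    desc_ren = F.normalize(desc_ren, dim=1)
    # Pairwise Euclidean distances between all camera/rendered descriptors.
    dist = torch.cdist(desc_cam, desc_ren)  # (B, B)
    pos = dist.diag()                       # distances of matching pairs
    # Mask out the positives, then mine the hardest in-batch negative.
    eye = torch.eye(len(dist), device=dist.device, dtype=torch.bool)
    neg = dist.masked_fill(eye, float("inf")).min(dim=1).values
    # Pull matching cross-domain pairs together, push mismatches apart.
    return F.relu(margin + pos - neg).mean()
```

A full training objective of this kind would presumably add a symmetric rendered-to-camera term and the extra supervision on intermediate feature maps described in the method section.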