“…Baselines: Cross-view geo-localization (CVGL) has attracted significant research interest, and several strong methods have emerged in the field. To demonstrate the superiority of our proposed method, we selected 17 strong baselines and state-of-the-art methods in total, i.e., Workman et al. [9], Vo et al. [10], Zhai et al. [71], Cross-View Matching Network (CVM-Net) [11], Liu et al. [31], Regmi et al. [12], Spatial-Aware Feature Aggregation network (SAFA) [23], Cross-View Feature Transport technique (CVFT) [24], Dynamic Similarity Matching network (DSM) [73], Toker et al. [41], Layer-to-Layer Transformer (L2LTR) [14], Local Pattern Network (LPN) [26], SAFA + Unit Subtraction Attention Module (USAM) [74], LPN + USAM [74], pure transformer-based geo-localization (Trans-Geo) [13], Transformer-Guided Convolutional Neural Network (TransGCNN) [25], and LPN + Dynamic Weighted Decorrelation Regularization (DWDR) [27]. For a comprehensive comparison, we train each baseline with its recommended settings.…”