Extended Reality (XR) experiences are on the verge of wide adoption across diverse application domains. An essential part of the technology is accurate tracking and localization of the headset, which is needed to create an immersive experience. A subset of these domains requires precise co-location between the real and the virtual world, in which virtual objects are aligned with their real-world counterparts. Current headsets support co-location in small areas, but suffer from drift when scaled up to larger ones such as buildings or factories. This paper proposes tools and solutions for this challenge by splitting simultaneous localization and mapping (SLAM) into separate mapping and localization stages. In the first, pre-processing stage, a feature map is built for the entire tracking area. A global optimizer then corrects the deformations caused by drift, guided by a sparse set of ground truth markers in the point cloud of a laser scan. Optionally, the map is refined further by matching features between the ground truth keyframe images and renderings of the point cloud from their SLAM-estimated poses. In the second, real-time stage, the rectified feature map is used to perform localization and sensor fusion between the global tracking and the headset. The results show that the approach achieves robust co-location between the virtual and the real 3D environment in large and complex tracking environments.
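To illustrate how a sparse set of ground truth markers can constrain a drifted map, the following is a minimal sketch and not the global optimizer proposed in this paper. It assumes matched marker positions are available both in the SLAM map and in the laser scan, reduces the problem to 2D (a floor plan) for brevity, and fits a single rigid transform in closed form. The function names `fit_rigid_2d`, `apply_rigid_2d`, and `marker_residuals` and the sample coordinates are hypothetical. A single rigid fit cannot remove the non-rigid deformation that drift causes over large areas; the residuals it leaves at the markers only indicate how much correction a global optimizer would still have to distribute over the map.

```python
import math


def fit_rigid_2d(slam_pts, truth_pts):
    """Least-squares rotation + translation mapping slam_pts onto truth_pts.

    Both arguments are equally long lists of (x, y) tuples, where entry i of
    each list refers to the same physical marker (assumed correspondence).
    """
    n = len(slam_pts)
    if n != len(truth_pts) or n < 2:
        raise ValueError("need at least two matched marker pairs")

    # Centroids of both marker sets.
    csx = sum(p[0] for p in slam_pts) / n
    csy = sum(p[1] for p in slam_pts) / n
    ctx = sum(p[0] for p in truth_pts) / n
    cty = sum(p[1] for p in truth_pts) / n

    # Accumulate dot and cross products of the centered point pairs; the
    # optimal rotation angle is atan2(sum of crosses, sum of dots).
    dot = 0.0
    cross = 0.0
    for (sx, sy), (tx, ty) in zip(slam_pts, truth_pts):
        ax, ay = sx - csx, sy - csy
        bx, by = tx - ctx, ty - cty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)

    # Translation that maps the rotated SLAM centroid onto the truth centroid.
    c, s = math.cos(theta), math.sin(theta)
    t = (ctx - (c * csx - s * csy), cty - (s * csx + c * csy))
    return theta, t


def apply_rigid_2d(theta, t, pts):
    """Rotate pts by theta (radians) and then translate them by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]


def marker_residuals(aligned_pts, truth_pts):
    """Euclidean distance between each aligned marker and its ground truth."""
    return [math.hypot(ax - tx, ay - ty)
            for (ax, ay), (tx, ty) in zip(aligned_pts, truth_pts)]


# Hypothetical example: four markers at the corners of a 10 m x 5 m room.
truth = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]
# Simulated SLAM estimates: the same markers, rotated and shifted.
slam = apply_rigid_2d(-0.1, (1.0, -2.0), truth)

theta, t = fit_rigid_2d(slam, truth)
aligned = apply_rigid_2d(theta, t, slam)
print("rotation (rad):", theta)
print("max residual (m):", max(marker_residuals(aligned, truth)))
```

In this synthetic example the SLAM markers differ from the ground truth by an exact rigid motion, so the fit recovers it and the residuals vanish up to floating-point error. On a real drifted map the residuals would stay nonzero, which is the case that a global optimizer is meant to handle.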