In the age of spatial computing, computer vision is central, and the efficient segmentation of 3D scan data becomes a fundamental task. Existing segmentation methods are often locked to specific AI models, lack level-of-detail (LoD) capabilities, and do not support efficient incremental segmentation. These limitations hinder their application to XR systems that span architectural and urban scales, which demand both at-scale and detailed, up-to-date segmentation information while leveraging limited local hardware in distributed computing environments.

In this work, we present a novel framework that integrates multiple 2D AI models through AI-agnostic 3D geometry feature fusion, ensuring spatial consistency while taking advantage of the rapid advancements in 2D AI models. Our framework performs LoD segmentation, enabling swift segmentation of downsampled geometry and full detail only on the segments that require it. Additionally, it progressively builds a segmentation database, processing only newly added data and thereby avoiding the full point cloud reprocessing that limits previous methods.

In our use case, the framework analyzed a public building based on three scans: a drone LiDAR capture of the exterior, a static LiDAR capture of a room, and a user-held RGB-D camera capture of a section of that room. Our approach provided a fast understanding of building volumes and room elements, as well as the fully detailed geometry of a requested object, a “panel with good lighting and a view to a nearby building”, to implement an XR activity.

Our preliminary results are promising for applications in other urban and architectural contexts and point to further development of our Geometric Data Inference AI as a cornerstone for deeper, more accurate Multi-AI integration.
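As a rough illustration of what model-agnostic fusion of 2D predictions onto 3D geometry can look like, the sketch below back-projects points into each labeled camera view and majority-votes a class per point. All names (project_points, fuse_labels), the pinhole camera model, and the voting rule are assumptions made for this example only; they are not the framework's actual implementation, which is described in the body of the paper.

    import numpy as np

    def project_points(points_world, K, T_cam_from_world, image_shape):
        """Project Nx3 world points into one camera.

        Returns integer (u, v) pixel indices and a mask of points that land
        inside the image and lie in front of the camera.
        """
        n = points_world.shape[0]
        homo = np.hstack([points_world, np.ones((n, 1))])      # Nx4 homogeneous
        cam = (T_cam_from_world @ homo.T).T[:, :3]             # Nx3 in camera frame
        in_front = cam[:, 2] > 1e-6
        uvw = (K @ cam.T).T                                    # Nx3, pinhole projection
        uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        h, w = image_shape
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        return u, v, valid

    def fuse_labels(points_world, views, num_classes):
        """Accumulate per-point votes from several 2D label maps (one per view,
        possibly produced by different 2D AI models) and return the majority
        label per point, or -1 where no view observed the point."""
        votes = np.zeros((points_world.shape[0], num_classes), dtype=np.int32)
        for K, T_cam_from_world, label_map in views:
            u, v, valid = project_points(points_world, K, T_cam_from_world, label_map.shape)
            labels = label_map[v[valid], u[valid]]              # per-point class from this view
            votes[np.flatnonzero(valid), labels] += 1
        fused = votes.argmax(axis=1)
        fused[votes.sum(axis=1) == 0] = -1                      # unobserved points stay unlabeled
        return fused

A practical pipeline would presumably also weight votes by model confidence, check occlusions, and store the fused labels per LoD level in the incremental segmentation database; none of those details are specified here and are left out of the sketch.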