TimberSLAM (TSLAM) is an object-centered, tag-based visual self-localization and mapping (SLAM) system for monocular RGB cameras. It was specifically developed to support a robust and augmented reality pipeline for close-range, noisy, and cluttered fabrication sequences that involve woodworking operations, such as cutting, drilling, sawing, and screwing with multiple tools and end-effectors. By leveraging and combining multiple open-source projects, we obtain a functional pipeline that can map, three-dimensionally reconstruct, and finally provide a robust camera pose stream during fabrication time to overlay an execution model with its digital-twin model, even under close-range views, dynamic environments, and heavy scene obstructions. To benchmark the proposed navigation system under real fabrication scenarios, we produce a data set of 1344 closeups of different woodworking operations with multiple tools, tool heads, and varying parameters (e.g., tag layout and density). The evaluation campaign indicates that TSLAM is satisfyingly capable of detecting the camera’s millimeter position and subangular rotation during the majority of fabrication sequences. The reconstruction algorithm’s accuracy is also gauged and yields results that demonstrate its capacity to acquire shapes of timber beams with up to two preexisting joints. We have made the entire source code, evaluation pipeline, and data set open to the public for reproducibility and the benefit of the community.
Graphic abstract