“…Instead of learning the en-tire pipeline, scene coordinate regression methods learn the first stage of the pipeline in the structure-based approaches. Namely, either a random forest [4,12,13,20,30,32,33,50,57] or a neural network [3,5,6,7,9,10,11,27,28,30] is trained to directly predict 3D scene coordinates for the pixels and thus the 2D-3D correspondences are established. These methods do not explicitly rely on feature detection, description and matching, and are able to provide correspondences densely.…”