Visual marker systems have become a ubiquitous tool for supplying a reference frame to otherwise generic scenes. Over the last decades, a wide range of approaches has emerged, each with different strengths and limitations. Some techniques adopt tags that are optimized for high accuracy in the recovered camera pose, while others rely on designs that aim at maximizing detection speed or minimizing the effect of occlusion on the detection process. Most of them, however, employ a two-step procedure in which an initial homography estimate is used to map the marker from the image plane to an orthonormal world, where it is validated and recognized. In this paper, we present a general-purpose fiducial marker system that allows both steps to be performed directly in image space. Specifically, by exploiting projective invariants such as collinearity and cross-ratios, we introduce a detection and recognition algorithm that is fast, accurate and moderately robust to occlusion. The overall performance of the system is evaluated in an extensive experimental section, where a comparison with a well-known baseline technique is presented.
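As a purely illustrative aside, the two projective invariants named above (collinearity and the cross-ratio) can be checked numerically. The following minimal sketch is not the detection algorithm of this paper: the homography `H`, the sample points, and the helper names `apply_homography`, `cross_ratio` and `triangle_area` are all assumptions introduced for the example.

```python
import math

def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def cross_ratio(a, b, c, d):
    """Unsigned cross-ratio (|AC| * |BD|) / (|BC| * |AD|) of four collinear points."""
    return (dist(a, c) * dist(b, d)) / (dist(b, c) * dist(a, d))

def triangle_area(a, b, c):
    """Area of triangle abc; it is zero exactly when the points are collinear."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

# Four points on the line y = 0.5 * x + 1 (hypothetical sample data).
points = [(0.0, 1.0), (1.0, 1.5), (2.0, 2.0), (4.0, 3.0)]

# An arbitrary non-singular homography with a mild perspective component
# (assumed values, chosen only for illustration).
H = [[1.0, 0.2, 3.0],
     [0.1, 0.9, -2.0],
     [0.001, 0.002, 1.0]]

mapped = [apply_homography(H, p) for p in points]

cr_before = cross_ratio(*points)   # 1.5 by construction
cr_after = cross_ratio(*mapped)    # agrees with cr_before up to rounding
area_after = triangle_area(*mapped[:3])  # stays (numerically) zero

print(cr_before, cr_after, area_after)
```

Distances and ratios of distances between the points change under `H`, but the cross-ratio and the collinearity of the points do not, which is why such quantities can be tested directly in image space without first rectifying the marker.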
I. Introduction

A visual marker is an artificial object consistent with a known model that is placed into a scene in order to supply a reference frame. Currently, such artifacts are unavoidable whenever a high level of precision and repeatability in image-based measurement is required, as in the case of accurate camera pose estimation, 3D structure-from-motion or, more generally, any flavor of vision-driven dimensional assessment task. While in some scenarios approaches based on naturally occurring features have been shown to obtain satisfactory results, they still suffer from a number of shortcomings that severely hinder their broader use. Specifically, the lack of a well-known model limits their usefulness in pose estimation and, even when such a model can be inferred (for instance by using bundle adjustment), its accuracy heavily depends on the correctness of the localization and matching steps. Moreover, the availability and distinctiveness of natural features are not guaranteed at all. Indeed, the smooth surfaces found in most man-made objects can easily lead to scenes that are very poor in features. Finally, photometric inconsistencies due to reflective or translucent materials jeopardize the repeatability of the detected points. For these reasons, it is not surprising that artificial fiducial tags continue to be widely used and are still an active research topic.

Markers are generally designed to be easily detected and recognized in images produced by a pinhole-modeled camera. In this sense, they make heavy use of the projective invariance properties of geometrical entities such as lines, planes and conics. One of the earliest properties to be exploited is probably the invariance of ellipses with respect to projective transformation: ellipses, and circles in particular, appear as (different) ellipses under any projective transformation. This allows...