Abstract. A method is proposed that visually estimates the 3D pose and endpoints of a thin cylindrical physical object, such as a wand, a baton, or a stylus, that is manipulated by a user. The method uses multiple synchronized images of the object to cover a wide spatial range, increase accuracy, and handle occlusions. Experiments demonstrate that the method runs in real time on modest, conventional hardware and that its output is suitable for human-computer interaction.