Abstract

In this work, we present a computational framework for automatically generating kinematic models of planar mechanical linkages from raw images. The hallmark of our approach is a novel combination of supervised learning methods for detecting mechanical parts (e.g., joints, rigid bodies) with the optimizing power of a multiobjective evolutionary algorithm, which concurrently maximizes image consistency and mechanical feasibility. A rigorous set of experiments was conducted to systematically evaluate the performance of each phase in our framework, comparing various combinations of joint and body detection schemes and feasibility constraints. Precision-recall curves are used to assess object detection performance. For the optimization, in addition to standard accuracy measures such as top-N accuracy, we introduce a new performance metric, the user effort ratio, which quantifies the amount of user interaction required to correct an inaccurate optimization solution. Current state-of-the-art performance is achieved with (i) one (or a cascade of) support vector machines for joint detection, (ii) foreground extraction to reduce false positives, (iii) supervised body detection using normalized geodesic time, distance, and detected joint confidence, and (iv) feasibility constraints derived from graph theory. The proposed framework generalizes moderately well from textbook graphics to hand-drawn sketches, and the user effort ratio results demonstrate the potential of an interactive system in which simple user interactions complement computer recognition for fast kinematic modeling.
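To make the abstract's central idea concrete, the following is a minimal, self-contained sketch of a multiobjective evolutionary search that concurrently maximizes an image-consistency objective and a mechanical-feasibility objective over candidate linkage hypotheses. The candidate encoding, both objective functions, and all parameters here are illustrative assumptions for exposition only, not the framework's actual implementation.

```python
import random

# Hypothetical candidate: joint positions plus a body-connectivity list.
# Both objectives below are placeholders, not the paper's actual measures.

def image_consistency(candidate, detections):
    # Placeholder: reward joints that lie close to detected joint locations.
    score = 0.0
    for (jx, jy) in candidate["joints"]:
        score -= min((jx - dx) ** 2 + (jy - dy) ** 2 for (dx, dy) in detections)
    return score

def mechanical_feasibility(candidate):
    # Placeholder: penalize bodies connected to fewer than two joints,
    # standing in for the graph-theoretic feasibility constraints.
    return -sum(1 for body in candidate["bodies"] if len(body) < 2)

def dominates(a, b):
    # Pareto dominance for two maximized objectives.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def evolve(detections, pop_size=30, generations=50):
    def random_candidate():
        joints = [(random.random(), random.random()) for _ in range(4)]
        return {"joints": joints, "bodies": [[0, 1], [1, 2], [2, 3]]}

    def mutate(c):
        joints = [(x + random.gauss(0, 0.05), y + random.gauss(0, 0.05))
                  for (x, y) in c["joints"]]
        return {"joints": joints, "bodies": [list(b) for b in c["bodies"]]}

    pop = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        pop += [mutate(random.choice(pop)) for _ in range(pop_size)]
        scored = [((image_consistency(c, detections), mechanical_feasibility(c)), c)
                  for c in pop]
        # Keep the non-dominated front, then fill the remaining slots.
        front = [c for f, c in scored
                 if not any(dominates(g, f) for g, _ in scored)]
        rest = [c for _, c in scored if c not in front]
        pop = (front + rest)[:pop_size]
    return pop

if __name__ == "__main__":
    detected_joints = [(0.2, 0.2), (0.5, 0.8), (0.8, 0.3), (0.4, 0.4)]
    pareto_front = evolve(detected_joints)
    print(len(pareto_front), "candidate models retained")
```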