Model-based image coding is potentially a powerful technique for compressing scenes dominated by head-and-shoulders images, as in a videotelephone scene with a close-up of the human upper body. In this paper we describe a Kalman filtering-based technique to robustly recover the 3-D structure and kinematics of the head and arm in view from optical flow. Because it explicitly models measurement noise and modeling uncertainty, the extended Kalman filter (EKF) has been shown to improve the robustness of shape-from-motion techniques. The robustness of the recursive estimation technique is further enhanced by 1) confining feature tracking to the neighborhoods of a skeleton sketch of the head and arm, 2) formulating the measurement noise of the optical flow as a function of the optical flow's confidence measures, and 3) fusing into the recursive estimation a set of constraints that govern the kinematic linkages of an articulated arm.
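
To illustrate the second enhancement, the following is a minimal sketch of a confidence-weighted measurement update. It is not the formulation used in this paper: it assumes a scalar state with an identity measurement model (a plain Kalman update rather than the full EKF), and the function names and the inverse mapping from confidence to noise variance are hypothetical choices made only for illustration.

```python
def noise_from_confidence(confidence, base_var=1.0, eps=1e-6):
    """Hypothetical mapping (an assumption, not from the paper):
    lower confidence gives a larger measurement-noise variance."""
    return base_var / max(confidence, eps)


def kalman_update(x, p, z, r):
    """Scalar Kalman measurement update with an identity measurement model.

    x, p : prior state estimate and its variance
    z, r : measurement and its noise variance
    """
    k = p / (p + r)            # Kalman gain
    x_new = x + k * (z - x)    # correct the estimate toward the measurement
    p_new = (1.0 - k) * p      # posterior variance shrinks after the update
    return x_new, p_new


def fuse(x0, p0, measurements):
    """Recursively fuse (measurement, confidence) pairs into one estimate."""
    x, p = x0, p0
    for z, confidence in measurements:
        r = noise_from_confidence(confidence)
        x, p = kalman_update(x, p, z, r)
    return x, p
```

Under these assumptions, a flow measurement with low confidence receives a large noise variance and therefore a small gain, so it moves the recursive estimate less than a high-confidence measurement of the same value.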