This work integrates visual and physical constraints to perform real-time, depth-only tracking of articulated objects, with a focus on tracking a robot's manipulators and manipulation targets in realistic scenarios. To this end, we extend DART, an existing visual articulated-object tracker, so that it additionally avoids interpenetration between multiple interacting objects and makes use of contact information collected via torque or touch sensors. To achieve greater stability, the tracker uses a switching model that detects when an object is stationary relative to the table or to the palm, and then aggregates information from multiple frames to converge to an accurate and stable estimate. Deviations from these stable states are detected so that the tracker remains robust to failed grasps and dropped objects. The tracker is integrated into a shared autonomy system, in which it provides state estimates used by a grasp planner and the controller of two anthropomorphic hands. We demonstrate the advantages and performance of the tracking system in simulation and on a real robot. Qualitative results are also provided for a number of challenging manipulation tasks that are made possible by the speed, accuracy, and stability of the tracking system.
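To make the switching behavior concrete, the sketch below illustrates one plausible form of the latch-and-average idea: the tracker watches the object's pose relative to the table and to the palm, latches onto whichever reference the object appears stationary against, averages the latched pose over subsequent frames, and releases the latch when the observed deviation grows (e.g. a dropped object or failed grasp). This is not the paper's implementation; the class name, thresholds, frame counts, and the simplification of averaging only the translation are illustrative assumptions.

```python
import numpy as np

# States of the (illustrative) switching model.
FREE, STATIONARY_TABLE, STATIONARY_PALM = "free", "stationary_table", "stationary_palm"


class SwitchingStabilizer:
    """Latches onto a 'stationary w.r.t. table/palm' hypothesis and averages the
    object pose over multiple frames while the hypothesis holds.
    Thresholds and frame counts are illustrative, not the paper's values."""

    def __init__(self, enter_thresh=0.002, exit_thresh=0.02, enter_frames=10):
        self.enter_thresh = enter_thresh   # max per-frame motion [m] to count as "quiet"
        self.exit_thresh = exit_thresh     # deviation [m] that breaks the stable state
        self.enter_frames = enter_frames   # consecutive quiet frames needed to latch
        self.state = FREE
        self.quiet = {STATIONARY_TABLE: 0, STATIONARY_PALM: 0}
        self.prev_rel = {STATIONARY_TABLE: None, STATIONARY_PALM: None}
        self.accum_t = None                # running mean of the latched translation
        self.n_accum = 0

    @staticmethod
    def _relative(T_ref, T_obj):
        # Object pose expressed in the reference (table or palm) frame.
        return np.linalg.inv(T_ref) @ T_obj

    def update(self, T_table, T_palm, T_obj):
        """Feed one frame of raw tracker output (4x4 poses in the camera frame);
        returns (state, stabilized object translation)."""
        refs = {STATIONARY_TABLE: T_table, STATIONARY_PALM: T_palm}

        if self.state == FREE:
            # Count consecutive quiet frames against each candidate reference.
            for s, T_ref in refs.items():
                rel = self._relative(T_ref, T_obj)
                prev = self.prev_rel[s]
                moved = np.inf if prev is None else np.linalg.norm(rel[:3, 3] - prev[:3, 3])
                self.quiet[s] = self.quiet[s] + 1 if moved < self.enter_thresh else 0
                self.prev_rel[s] = rel
                if self.quiet[s] >= self.enter_frames:   # latch onto this hypothesis
                    self.state, self.accum_t, self.n_accum = s, rel[:3, 3].copy(), 1
            return self.state, T_obj[:3, 3]

        # In a stable state: break the latch on large deviation, otherwise refine
        # the estimate by averaging over frames.
        rel = self._relative(refs[self.state], T_obj)
        if np.linalg.norm(rel[:3, 3] - self.accum_t) > self.exit_thresh:
            self.state = FREE                            # e.g. dropped object / failed grasp
            self.quiet = {STATIONARY_TABLE: 0, STATIONARY_PALM: 0}
            return self.state, T_obj[:3, 3]
        self.n_accum += 1
        self.accum_t += (rel[:3, 3] - self.accum_t) / self.n_accum
        stabilized = refs[self.state] @ np.r_[self.accum_t, 1.0]
        return self.state, stabilized[:3]
```

A full system would also average orientation and would consult the tracker's own residuals and contact signals rather than a pure translation threshold; the sketch only shows how a stationary hypothesis can be entered, exploited over multiple frames, and abandoned when it stops explaining the observations.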