Task sharing between heterogeneous robots currently requires a priori capability knowledge, a shared communication protocol, or a centralized planner. However, in practice, when two robots are brought together, the effort required to construct shared action and structure models is significant. In this paper, we describe our approach to determining the kinematic model of a robot based purely on observation of unscripted movement. We describe construction of large-scale data simulating low-cost RGB-D camera output, and application of two different RNN-based methods to the learning problem. Our results suggest that this is an efficient and effective way to determine a robot's morphological structure without requiring communication or pre-existing knowledge of its capabilities.