Human motion tracking can be viewed as a multi-target tracking problem over numerous body joints. Among existing techniques, human motion tracking based on inertial measurement units (IMUs) stands out and has been widely used in body area network applications. However, it suffers from accumulated errors and drift. In this paper, we propose a multi-sensor hybrid method to address this problem. First, a fusion method combining IMU and time-of-arrival (TOA) measurements is proposed to compensate for the drift and accumulated errors caused by inertial sensors. Second, the Cramér-Rao lower bound (CRLB) is derived in detail, taking both spatial and temporal factors into account. Simulation results show that the proposed method has both spatial and temporal advantages over traditional tracking methods based solely on inertial or TOA measurements. Furthermore, the proposed method is verified in practical 3D application scenarios. Compared with state-of-the-art algorithms, the proposed fusion method shows better consistency and higher tracking accuracy, especially when the direction of movement changes. The proposed fusion method and the comprehensive analysis of fundamental limits conducted in this paper provide a theoretical basis for further system design and algorithm analysis. Requiring no external anchors, the proposed method offers good stability and high tracking accuracy, which makes it more suitable for wearable motion tracking applications.
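To illustrate how range measurements can bound inertial drift, the following is a minimal sketch rather than the algorithm proposed in this paper. It assumes 1D motion, a constant accelerometer bias, a hypothetical on-body reference node at the origin that supplies TOA ranges, and a generic linear Kalman filter with hand-picked noise parameters. All function names, parameter values, and the trajectory are illustrative assumptions.

```python
import math
import random

C = 299_792_458.0  # assumed propagation speed of the ranging signal (m/s)


def true_position(t):
    """Hypothetical ground truth: a joint moving 1 m to 5 m from the reference node."""
    return 3.0 + 2.0 * math.sin(0.5 * t)


def true_acceleration(t):
    """Second time derivative of true_position."""
    return -0.5 * math.sin(0.5 * t)


def simulate(steps=400, dt=0.05, accel_bias=0.1, accel_sigma=0.05,
             toa_sigma=0.1, seed=0):
    """Return (inertial_only_error, fused_error) in metres after steps * dt seconds."""
    rng = random.Random(seed)
    q = 0.5 ** 2        # assumed acceleration noise variance used by the filter
    r = toa_sigma ** 2  # variance of the TOA-derived range (m^2)

    # Both estimators start from the true initial state (velocity is 1 m/s at t = 0).
    p_dr, v_dr = true_position(0.0), 1.0
    p, v = p_dr, v_dr
    P00, P01, P10, P11 = 0.01, 0.0, 0.0, 0.01  # state covariance entries

    for i in range(steps):
        # Biased, noisy accelerometer sample (assumed IMU error model).
        a = true_acceleration(i * dt) + accel_bias + rng.gauss(0.0, accel_sigma)

        # Inertial-only dead reckoning: the bias is integrated twice and drifts.
        p_dr += v_dr * dt + 0.5 * a * dt * dt
        v_dr += a * dt

        # Kalman prediction driven by the same IMU sample.
        p += v * dt + 0.5 * a * dt * dt
        v += a * dt
        P00 = P00 + dt * (P01 + P10) + dt * dt * P11 + q * 0.25 * dt ** 4
        P01 = P01 + dt * P11 + q * 0.5 * dt ** 3
        P10 = P10 + dt * P11 + q * 0.5 * dt ** 3
        P11 = P11 + q * dt * dt

        # TOA measurement: noisy time of flight converted to a range.
        tof = true_position((i + 1) * dt) / C + rng.gauss(0.0, toa_sigma / C)
        z = C * tof

        # Kalman update with measurement model z = position + noise.
        s = P00 + r
        k0, k1 = P00 / s, P10 / s
        innovation = z - p
        p += k0 * innovation
        v += k1 * innovation
        P00, P01, P10, P11 = ((1.0 - k0) * P00, (1.0 - k0) * P01,
                              P10 - k1 * P00, P11 - k1 * P01)

    truth = true_position(steps * dt)
    return abs(p_dr - truth), abs(p - truth)


if __name__ == "__main__":
    dr_err, fused_err = simulate()
    print(f"inertial-only error: {dr_err:.2f} m, fused error: {fused_err:.2f} m")
```

In this toy setting the inertial-only error grows roughly quadratically with time because the bias is integrated twice, while the fused estimate stays bounded by the range corrections. A filter for the 3D multi-joint case would need a nonlinear range model and a richer state.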