Quantifying human motor acts starts with measuring and estimating kinematic and dynamic variables as accurately as possible. Monitoring human motion has a wide array of applications in functional rehabilitation, orthopaedics, sports, assistive robotics, and industrial ergonomics. Today's reference motion capture systems are usually stereophotogrammetric systems and laboratory-grade force plates, which are accurate but also costly, require expert skills, and are not portable. Recently, the use of affordable sensors for human motion estimation, such as inertial measurement units (IMUs) or RGB-Depth cameras, has been the subject of numerous studies. Despite their great potential for use outside the laboratory, these systems still suffer from limited accuracy, mainly due to inherent IMU drift and visual occlusions, and joint kinematics and kinetics remain difficult to estimate. These drawbacks might explain why such systems are rarely used in common clinical applications or in in-home rehabilitation programs. In this context, this thesis deals with the development of a new affordable motion capture system capable of accurately estimating the 3D human joint state. Unlike previous studies based on either visual or inertial sensors, the proposed approach combines data from newly designed visual-inertial sensors. The system also makes use of new practical calibration methods that do not require any external equipment while remaining very affordable. All sensor data are fused in a constrained extended Kalman filter that takes advantage of the biomechanics of the human body and of the investigated tasks to significantly improve the joint state estimate. This is done by incorporating different types of constraints, such as joint limits, rigid-body and soft joint constraints, as well as by modelling the temporal evolution of joint trajectories and/or random sensor biases.
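To illustrate how a joint-limit constraint can enter a Kalman filter, the following is a minimal sketch and not the filter developed in this thesis. It assumes a single joint with state [angle, angular velocity], a constant-velocity process model, a direct (noisy) angle measurement, and a simple clipping step that projects the estimated angle back into its admissible range. Because both models are linear here, the filter reduces to a standard linear Kalman filter; the function name `constrained_kf_step` and all parameter values are hypothetical.

```python
import numpy as np

def constrained_kf_step(x, P, z, dt, q_min, q_max, q_var=1e-2, r_var=1e-2):
    """One predict/update cycle followed by a joint-limit projection.

    x: state [angle (rad), angular velocity (rad/s)], shape (2,)
    P: state covariance, shape (2, 2)
    z: measured joint angle (rad)
    q_min, q_max: assumed joint limits (rad)
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])           # constant-velocity model
    H = np.array([[1.0, 0.0]])                      # only the angle is measured
    Q = q_var * np.array([[dt**4 / 4, dt**3 / 2],
                          [dt**3 / 2, dt**2]])      # process noise (assumed)
    R = np.array([[r_var]])                         # measurement noise (assumed)

    # Prediction
    x = F @ x
    P = F @ P @ F.T + Q

    # Measurement update
    y = z - H @ x                                   # innovation, shape (1,)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain, shape (2, 1)
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

    # Joint-limit constraint: clip the angle estimate to its admissible range
    x[0] = np.clip(x[0], q_min, q_max)
    return x, P

# Example: measurements beyond an assumed limit of 2.5 rad
x, P = np.zeros(2), np.eye(2)
for _ in range(50):
    x, P = constrained_kf_step(x, P, z=3.0, dt=0.01, q_min=0.0, q_max=2.5)
print(x[0])
```

Clipping is the simplest way to enforce an inequality constraint and leaves the covariance untouched; rigid-body and soft joint constraints of the kind mentioned above would require a different treatment, for example additional pseudo-measurements.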
The system's ability to estimate accurate 3D joint kinematics has been validated through various case studies of daily life activities for the upper arm and for treadmill gait. Two prototypes with different sensor counts and configurations have been investigated. Experiments conducted with several healthy subjects showed very satisfactory results when compared to a gold standard motion capture system. Overall, the average RMS difference between the two systems was below 4°. This was also the case when a reduced number of sensors was used for gait analysis. The system was also used for the dynamics identification of a lower-limb human-exoskeleton system. As a result, an error below 6% was observed when comparing estimated and measured external ground reaction forces and moments. Finally, beyond these validations, a dynamics assessment framework has been proposed with the aim of selecting an optimal human-exoskeleton dynamic model that offers the best trade-off between the accuracy of kinetic estimation, i.e., joint torque, and simplicity of modelling. To this end, the proposed framework consists in quantifying the independent con...