Indoor positioning has become an emerging research area because of huge commercial demands for location-based services in indoor environments. Channel State Information (CSI) as a fine-grained physical layer information has been recently proposed to achieve high positioning accuracy by using range-based methods, e.g., trilateration. In this work, we propose to fuse the CSI-based ranges and velocity estimated from inertial sensors by an enhanced particle filter to achieve highly accurate tracking. The algorithm relies on some enhanced ranging methods and further mitigates the remaining ranging errors by a weighting technique. Additionally, we provide an efficient method to estimate the velocity based on inertial sensors. The algorithms are designed in a network-based system, which uses rather cheap commercial devices as anchor nodes. We evaluate our system in a complex environment along three different moving paths. Our proposed tracking method can achieve 1.3m for mean accuracy and 2.2m for 90% accuracy, which is more accurate and stable than pedestrian dead reckoning and range-based positioning.