The virtual rail train (VRT) is a multi-articulated vehicle and a novel public transportation system that offers low cost, environmental friendliness, and high transit capacity. Equipped with all-wheel steering (AWS) and a tracking control method, the super-long VRT can travel easily on urban roads. This paper proposes a tracking control approach that uses only interoceptive sensors and adapts well to different scenes. First, a kinematic model is established under reasonable assumptions, and the sensor configuration is specified alongside it. A hierarchical controller is then designed, consisting of a front-axle controller and a rear-axle controller. The former applies virtual-axle theory to avoid motion interference between axles. The latter generates a reference path from the first axle, using path segmentation and a data-updating method to improve storage and computational efficiency. A fast curvature-matching control method for the rear axles is then developed that accounts for actuator time delay. Finally, the proposed approach is verified in hardware-in-the-loop (HIL) simulation under various scenarios against predefined evaluation criteria, and the results show improved tracking performance and applicability.
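As an illustration of the rear-axle idea summarized above, the following is a minimal Python sketch and not the controller developed in this paper. It assumes that the first axle records samples of arc length and curvature, that samples older than a fixed retained length are discarded (a simple stand-in for the data-updating step), and that actuator delay is compensated by looking ahead by the distance travelled during the delay. The names `FirstAxlePath`, `keep_length_m`, and `rear_axle_target_curvature` are hypothetical.

```python
from bisect import bisect_right
from collections import deque


class FirstAxlePath:
    """Bounded record of (arc length, curvature) samples laid down by the first axle."""

    def __init__(self, keep_length_m):
        # Assumption: retained length is roughly the train length plus a margin.
        self.keep_length_m = keep_length_m
        self.s = deque()      # arc length travelled by the first axle [m]
        self.kappa = deque()  # path curvature at that arc length [1/m]

    def append(self, s, kappa):
        """Add the newest sample and drop samples no rear axle can still need."""
        self.s.append(s)
        self.kappa.append(kappa)
        # Keep one sample at or before the oldest needed position so that
        # interpolation still covers the whole retained interval.
        while len(self.s) > 2 and self.s[1] <= s - self.keep_length_m:
            self.s.popleft()
            self.kappa.popleft()

    def curvature_at(self, s_query):
        """Linearly interpolate curvature, clamping to the stored range."""
        s = list(self.s)
        k = list(self.kappa)
        if s_query <= s[0]:
            return k[0]
        if s_query >= s[-1]:
            return k[-1]
        i = bisect_right(s, s_query)
        t = (s_query - s[i - 1]) / (s[i] - s[i - 1])
        return k[i - 1] + t * (k[i] - k[i - 1])


def rear_axle_target_curvature(path, s_front, axle_offset_m, speed_mps, delay_s):
    """Curvature a rear axle should match, previewed to offset actuator delay.

    Assumption: the delay is handled by shifting the lookup point forward by
    speed * delay; this is a first-order simplification.
    """
    s_axle = s_front - axle_offset_m
    return path.curvature_at(s_axle + speed_mps * delay_s)
```

With zero delay, the function returns the curvature that the first axle experienced at the rear axle's current position; a positive delay shifts the lookup forward so that the steering command is issued before the axle reaches that point. Converting the target curvature into individual wheel steering angles depends on the vehicle's kinematic model and is outside the scope of this sketch.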