Preceding vehicles have a significant impact on the safety of the ego-vehicle, whether or not they travel in the same direction as the ego-vehicle. Reliable trajectory prediction of preceding vehicles is therefore crucial for safer planning. In this paper, we propose a framework for predicting the trajectories of preceding target vehicles in urban scenarios using multi-sensor fusion. First, the historical trajectory of each preceding target vehicle is acquired in the dynamic scene by fusing data from LIDAR, a camera, and a combined inertial navigation system. Next, a Savitzky–Golay filter is applied to smooth the vehicle trajectory. Then, two transformer-based networks are built to predict the future trajectories of the preceding target vehicles: a traditional transformer and a cluster-based transformer. In the traditional transformer, the trajectories are predicted from the velocities along the X-axis and Y-axis. In the cluster-based transformer, the k-means algorithm is combined with the transformer to predict the trajectory in a high-dimensional space based on classification. Driving data collected in a real-world environment in Wuhan, China, are used to train and validate the proposed trajectory prediction algorithm. The performance analysis confirms that both proposed transformer methods can effectively predict trajectories using multi-sensor fusion, and that the cluster-based transformer achieves better performance than the traditional transformer.
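To make the smoothing step concrete, the following is a minimal sketch of how a fused trajectory might be smoothed and converted into X-axis and Y-axis velocities before being passed to a prediction network. It is illustrative only and is not the implementation used in this work: the 5-point window, the quadratic polynomial order, the handling of edge samples, the sampling interval `dt`, and the function names `savgol5`, `velocities`, and `smooth_and_differentiate` are all assumptions introduced for the example.

```python
# Illustrative sketch only; window length, polynomial order, and edge
# handling are assumptions, not values taken from the paper.

# Standard 5-point quadratic Savitzky-Golay smoothing weights.
SG5_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(values):
    """Smooth a 1-D sequence with a 5-point quadratic Savitzky-Golay filter.

    Assumption: the first two and last two samples are left unchanged,
    because a full 5-point window is not available there.
    """
    out = [float(v) for v in values]
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / SG5_NORM
    return out


def velocities(positions, dt):
    """Forward-difference velocity between consecutive position samples."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


def smooth_and_differentiate(xs, ys, dt):
    """Smooth X and Y positions, then return a list of (vx, vy) pairs."""
    smooth_x = savgol5(xs)
    smooth_y = savgol5(ys)
    return list(zip(velocities(smooth_x, dt), velocities(smooth_y, dt)))


if __name__ == "__main__":
    # Hypothetical positions (metres) sampled every 0.1 s.
    xs = [0.0, 1.1, 1.9, 3.2, 4.0, 4.9, 6.1]
    ys = [0.0, 0.1, -0.1, 0.05, 0.0, 0.1, 0.0]
    for vx, vy in smooth_and_differentiate(xs, ys, dt=0.1):
        print(f"vx={vx:.2f} m/s, vy={vy:.2f} m/s")
```

A polynomial filter of this kind reproduces locally quadratic motion exactly while attenuating isolated spikes, which is one reason it suits trajectory data. In practice a library routine such as `scipy.signal.savgol_filter` would likely be preferred over hand-written weights, since it supports arbitrary window lengths and polynomial orders.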