Underestimating fuel consumption has consequences in several areas. In the vehicle market, manufacturers often advertise fuel economy for marketing purposes, yet the reference fuel consumption they report differs considerably from the real-world fuel consumption of their vehicles. This divergence between reference and real-world fuel consumption also has negative effects on policy and the environment. To promote the sustainable development of transport effectively, it is essential to characterize the real-world fuel consumption of vehicles. Previous studies are limited by small sample sizes, a single data dimension, and a lack of feature weight evaluation. To fill these research gaps, we conduct a comparative analysis in which we build five regression models to forecast the real-world fuel consumption rate of light-duty gasoline vehicles in China, using big data that covers vehicle factors, environmental factors, and driving behavior factors. Results show that the random forest regression model performs best among the five candidate models, with a mean absolute error of 0.630 L/100 km, a mean absolute percentage error of 7.5%, a mean squared error of 0.805, an R squared of 0.776, and a 10-fold cross-validation score of 0.791. Further, we identify the most important features affecting fuel consumption among the 25 factors drawn from these three perspectives. According to the relative weight of each factor in the best-performing model, the three most important factors are, in descending order, brake and accelerator habits, engine power, and the fuel economy consciousness of vehicle owners.
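The error measures reported above can be illustrated with a minimal sketch, assuming the standard textbook definitions of mean absolute error, mean absolute percentage error, mean squared error, and R squared; the exact formulas used in the study are not stated here. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
from math import fsum


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in percent), MSE, and R squared.

    Assumes the standard definitions; y_true must contain no zeros
    (for MAPE) and must not be constant (for R squared).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = fsum(abs(e) for e in errors) / n
    mape = 100.0 * fsum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    ss_res = fsum(e * e for e in errors)  # residual sum of squares
    mse = ss_res / n

    mean_true = fsum(y_true) / n
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mape": mape, "mse": mse, "r2": r2}


# Hypothetical observed and predicted fuel consumption rates in L/100 km.
observed = [8.0, 10.0, 6.0, 12.0]
predicted = [7.5, 10.5, 6.0, 11.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

With a target measured in L/100 km, the mean absolute error computed this way is in L/100 km, the mean absolute percentage error is a percentage, and R squared is unitless.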