Assessing building energy consumption in urban neighborhoods at the early stages of urban planning helps decision-makers develop detailed urban renewal plans and sustainable development strategies. At the city level, physical simulation-based urban building energy modeling (UBEM) is too costly, and data-driven approaches are often hampered by a lack of building energy monitoring data. This paper combines the two approaches: UBEM provides a training dataset for machine learning, and the trained model is then deployed for large-scale prediction of urban building energy consumption. First, we collected data for 18,789 neighborhoods containing 248,938 buildings in the Shanghai central area, of which 2,702 neighborhoods were used for UBEM. Building functions were assigned using point-of-interest (POI) data and land use data, and each neighborhood was described by 14 impact factors related to land use and building morphology. Next, we compared the performance of six ensemble learning methods in modeling the relationship between the impact factors and building energy consumption, and used SHAP to explain the best model. To reduce model complexity, we retained only the features that contributed most to the model output. Finally, the balanced regressor, that is, the model with the best prediction accuracy for the minimum number of features, was used to predict the remaining urban neighborhoods in the Shanghai central area. The results show that XGBoost achieves the best performance. The balanced regressor, built from the nine most influential features, predicted building rooftop photovoltaic potential, total load, cooling load, and heating load with test set accuracies of 0.956, 0.674, 0.608, and 0.762, respectively. Our method offers an 85.5% time saving over traditional methods, with a maximum error of 22.75%.
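The feature-reduction step described above, choosing the model with the best accuracy for the fewest features, can be illustrated with a minimal sketch. It is not the authors' implementation: the function `select_balanced_subset`, the `tolerance` parameter, the feature names, and the importance values are all hypothetical, and the toy scoring function stands in for retraining a regressor (such as XGBoost) on each candidate subset and scoring it on held-out data. The ranking is assumed to come from a per-feature importance measure such as mean absolute SHAP value.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_balanced_subset(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    tolerance: float = 0.01,
) -> Tuple[List[str], float]:
    """Return the shortest top-k prefix of `ranked_features` whose score
    is within `tolerance` of the best score over all prefixes.

    `ranked_features` must be sorted from most to least important.
    `evaluate` maps a feature subset to a score where higher is better.
    """
    if not ranked_features:
        raise ValueError("ranked_features must not be empty")

    # Score every top-k prefix: top-1, top-2, ..., all features.
    scores = [
        evaluate(ranked_features[:k])
        for k in range(1, len(ranked_features) + 1)
    ]
    best = max(scores)

    # Pick the smallest k whose score is close enough to the best one.
    for k, score in enumerate(scores, start=1):
        if score >= best - tolerance:
            return list(ranked_features[:k]), score

    # Unreachable for tolerance >= 0, kept as a defensive fallback.
    return list(ranked_features), scores[-1]


# Hypothetical importance values (placeholders, not results from the paper).
importance: Dict[str, float] = {
    "feature_a": 0.50,
    "feature_b": 0.30,
    "feature_c": 0.15,
    "feature_d": 0.04,
    "feature_e": 0.01,
}
ranked = sorted(importance, key=importance.get, reverse=True)
total = sum(importance.values())


def toy_score(features: Sequence[str]) -> float:
    """Stand-in for 'train on these features and score on a test set'."""
    return sum(importance[f] for f in features) / total


selected, score = select_balanced_subset(ranked, toy_score, tolerance=0.02)
print(selected, round(score, 3))
```

In a real workflow, `evaluate` would fit the chosen regressor on the training neighborhoods using only the given features and return its test set score; the tolerance controls how much accuracy one is willing to trade for a simpler model.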