Accurate load forecasting plays a crucial role in effective energy management for smart cities. However, the load profiles of smart-city residents are nonlinear, with high volatility, uncertainty, and randomness, so forecasting them requires accurate and stable prediction models. To this end, a hybrid prediction model, FPP-MLP-GWDO, has been developed with three parts: (i) feature preprocessing (FPP), (ii) a multilayer perceptron (MLP), and (iii) a genetic wind-driven optimization (GWDO) algorithm. The MLP is the key part of the model and uses a multivariate autoregressive algorithm and the rectified linear unit (ReLU) for network training. The FPP-MLP-GWDO model is evaluated on Dayton, Ohio grid load data in terms of accuracy (the mean absolute percentage error (MAPE), Theil’s inequality coefficient (TIC), and the correlation coefficient (CC)) and convergence speed (computational time (CT) and convergence rate (CR)). The findings endorse the validity and applicability of the developed model in comparison with benchmark models from the literature: the feature selection–support vector machine–modified enhanced differential evolution (FS-SVM-mEDE) model, the feature selection–artificial neural network (FS-ANN) model, the support vector machine–differential evolution algorithm (SVM-DEA) model, and the autoregressive (AR) model. The FPP-MLP-GWDO model achieved an accuracy of 98.9%, surpassing the FS-SVM-mEDE (97.9%), SVM-DEA (97.5%), FS-ANN (96.5%), and AR (95.7%) models. Its CT (299 s) is lower than that of the FS-SVM-mEDE model (350 s); the reported CTs of the SVM-DEA, FS-ANN, and AR models are 240 s, 159 s, and 132 s, respectively.
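The three accuracy measures named above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: TIC is taken in its common normalized form (forecast RMSE divided by the sum of the root mean squares of the actual and forecast series), CC is taken as the Pearson correlation coefficient, and accuracy is taken as 100 minus MAPE. The function names and the load values are hypothetical and are not drawn from the Dayton data.

```python
import numpy as np


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))


def tic(actual, forecast):
    """Theil's inequality coefficient (assumed form): 0 is a perfect fit, 1 is the worst."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    rmse = np.sqrt(np.mean((actual - forecast) ** 2))
    scale = np.sqrt(np.mean(actual ** 2)) + np.sqrt(np.mean(forecast ** 2))
    return rmse / scale


def cc(actual, forecast):
    """Pearson correlation coefficient between actual and forecast load."""
    return float(np.corrcoef(actual, forecast)[0, 1])


# Hypothetical hourly loads in MW, invented for illustration only.
actual_load = np.array([100.0, 200.0, 400.0, 500.0])
forecast_load = np.array([110.0, 190.0, 400.0, 490.0])

error_pct = mape(actual_load, forecast_load)
accuracy_pct = 100.0 - error_pct  # assumed definition of "accuracy"

print("MAPE (%):", error_pct)
print("Accuracy (%):", accuracy_pct)
print("TIC:", tic(actual_load, forecast_load))
print("CC:", cc(actual_load, forecast_load))
```

Under these assumptions, a lower MAPE and TIC and a CC closer to 1 indicate a better forecast, which is the sense in which the models are ranked on accuracy.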