Short-Term Load Prediction (STLP) is an important part of energy planning. STLP is based on the analysis of historical data such as outdoor temperature, heat load, heat consumer configuration, and the season. This research aims to forecast heat consumption during the winter heating season. Preprocessing and analyzing the data reveals the patterns they contain, and the results of this analysis make it possible to form learning algorithms for an artificial neural network (ANN). The biggest disadvantage of an ANN is the lack of precise guidelines for architectural design; the number of hidden nodes, for example, is usually determined by trial and error. Another disadvantage is the presence of false information in the training data, which results from errors in measuring, collecting, and transferring data. To compare prediction accuracy, several models were proposed, including a conventional ANN and a wavelet ANN. The influence of different learning algorithms was also examined; the main differences between them were the training time and the number of epochs. To improve the quality of the raw data and remove false information, the raw data were normalized. Normalization was based on the Z-score of the data and on determination of the energy–entropy ratio. The purpose of this research was to compare the accuracy of various data processing and neural network training algorithms suitable for use in data-driven (black box) modeling. For this research, we used a software application created in the MATLAB environment. The application uses wavelet transforms to compare different heat demand prediction methods. Applying several wavelet transforms with various wavelet functions allowed us to determine the best algorithm and method for predicting heat production. The results show the need to normalize the raw data using wavelet transforms.
The sequence of steps involves the following milestones: normalization of the initial data; wavelet analysis employing quantitative criteria (energy, entropy, and energy–entropy ratio); optimization of ANN training using the information energy–entropy ratio; ANN training with different training algorithms; and evaluation of the obtained outputs using statistical methods. The developed application can serve as a control tool for dispatchers during planning.
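
The first two milestones above (normalization and the quantitative wavelet criteria) can be illustrated with a minimal Python sketch. It is not the MATLAB application used in this research. Several choices are assumptions made purely for illustration: a single-level Haar transform stands in for the wavelet functions studied, entropy is taken as the Shannon entropy of the normalized coefficient energies, the ratio is taken as energy divided by that entropy, and the `load` values are hypothetical, not measured data.

```python
import math
from statistics import mean, pstdev


def z_score(values):
    """Standardize a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in values]


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def shannon_entropy(coeffs):
    """Shannon entropy of the normalized energy distribution (assumed definition)."""
    total = energy(coeffs)
    if total == 0:
        return 0.0
    probs = [p for p in (c * c / total for c in coeffs) if p > 0]
    return -sum(p * math.log(p) for p in probs)


def energy_entropy_ratio(coeffs):
    """Energy divided by entropy (assumed definition); infinite when entropy is zero."""
    h = shannon_entropy(coeffs)
    return math.inf if h == 0 else energy(coeffs) / h


# Hypothetical hourly heat load values, for illustration only.
load = [4.1, 4.3, 4.0, 4.6, 5.2, 5.0, 4.8, 4.4]

normalized = z_score(load)
approx, detail = haar_step(normalized)

print("approximation ratio:", energy_entropy_ratio(approx))
print("detail ratio:", energy_entropy_ratio(detail))
```

Under these assumptions, a higher ratio marks a coefficient set that concentrates more energy in fewer coefficients, which is one way such a criterion could be used to rank candidate wavelet functions before ANN training.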