Data warehouses and data mining in forecasting the demand for gas and gas storage services

The article presents global trends in data warehousing and data mining in the oil and gas industry and confirms that advanced data mining tools can be used to forecast demand for natural gas and for underground gas storage services on the Polish market. As part of testing the usability of data mining software, a model was built that forecasts gas withdrawal from Polish underground gas storage facilities. The resulting forecasts are highly accurate, and the wizards built into the software minimized the workload and allowed the model-building process to be automated.

Keywords: data mining, gas industry, forecasting.
The paper presents contemporary trends in artificial intelligence and machine learning methods, including artificial neural networks, decision trees and fuzzy logic systems. Computational intelligence methods form part of the broader field of artificial intelligence research. Selected computational intelligence methods were used to build medium-term monthly forecasts of natural gas demand for Poland. The accuracy of forecasts obtained with an artificial neural network and a decision tree was compared with that of classical linear regression, using historical data from a ten-year period. The explanatory variables were gas consumption in other EU countries, average monthly temperature, industrial production, wages in the economy and the price of natural gas. Forecasting was carried out in five stages that differed in the selection of the training and test samples, the use of data preprocessing and the elimination of some variables. For raw data and a random training set, linear regression achieved the highest accuracy. For preprocessed data and a random training set, the decision tree was the most accurate. When the model was built on the first eight years and tested on the last two, regression gave the most accurate forecast, but only slightly better than the decision tree or the neural network, regardless of data normalization and the elimination of collinear variables. Machine learning methods produced accurate monthly gas consumption forecasts but were slightly outperformed by classical linear regression, which the paper attributes to the narrow set of explanatory variables. Machine learning methods are expected to become more effective as the volume of data grows and the set of candidate explanatory variables is expanded. With large volumes of data, they can build forecasting models more efficiently, without the analyst's laborious involvement in data preparation and multi-stage analysis. They also allow the form of the forecasting model to be updated frequently, even after each addition of new data to the database.
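The two sampling schemes described in the abstract (a random training set versus training on the first eight years and testing on the last two) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, only one explanatory variable (average monthly temperature) is used, only ordinary least squares regression is fitted, and mean absolute percentage error (MAPE) is chosen as the accuracy measure, since the abstract does not name one. It is not the paper's model or data.

```python
import math
import random


def make_synthetic_months(n_years=10, seed=0):
    """Return a list of (temperature, consumption) pairs, one per month.

    Purely synthetic, illustrative values: consumption falls as temperature rises.
    """
    rng = random.Random(seed)
    rows = []
    for month in range(n_years * 12):
        # Seasonal temperature cycle plus noise (coldest in month 0 of each year).
        temp = 8.0 - 11.0 * math.cos(2 * math.pi * (month % 12) / 12) + rng.gauss(0, 1.5)
        consumption = 1500.0 - 40.0 * temp + rng.gauss(0, 60.0)
        rows.append((temp, consumption))
    return rows


def fit_ols(rows):
    """Fit y = a + b * x by ordinary least squares; return (a, b)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mape(rows, a, b):
    """Mean absolute percentage error of the fitted line on the given rows."""
    return 100.0 * sum(abs(y - (a + b * x)) / abs(y) for x, y in rows) / len(rows)


def chronological_split(rows, train_years=8):
    """Train on the first train_years years, test on the remaining months."""
    cut = train_years * 12
    return rows[:cut], rows[cut:]


def random_split(rows, train_fraction=0.8, seed=1):
    """Shuffle the months and take train_fraction of them for training."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = make_synthetic_months()
splits = {
    "random": random_split(rows),
    "chronological": chronological_split(rows),
}

results = {}
for name, (train, test) in splits.items():
    a, b = fit_ols(train)
    results[name] = {"slope": b, "mape": mape(test, a, b)}
    print(f"{name}: slope={b:.2f}, test MAPE={results[name]['mape']:.2f}%")
```

With real data, the chronological split is the stricter test because the model never sees months from the forecast period. A decision tree or neural network could be substituted for `fit_ols` to reproduce the kind of comparison the abstract reports.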