Water supply association analysis plays an important role in attributing water supply fluctuations, so carrying it out effectively has become a critical problem. Current association analysis techniques are not very effective because they operate on continuous data. Continuous variables generally exhibit monotone relationships of varying degree, which easily bias the analysis, and multicollinearity among them distorts these methods and may produce incorrect results. These techniques also do not reveal the association rules or value intervals that link features to water supply. Such rules and intervals help identify the causes of water supply fluctuation and the range over which supply fluctuates, and so provide better support for water supply dispatching, but they are not yet fully understood. In this study, a data mining method coupling k-means clustering discretization with the Apriori algorithm was proposed. K-means was used to discretize the data into a one-hot encoding that Apriori can process, and the discretization also avoids the influence of monotone relationships and multicollinearity on the results. All rules are subsequently validated to filter out spurious ones. The results show that the method is an effective association analysis method: it obtains valid strong association rules between features and water supply and indicates whether each association is direct or indirect. It also yields the value intervals of features, the degree of association between features, and the confidence of each rule.
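The coupling of discretization and rule mining described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`temp`, `supply`), the toy values, the number of bins, and the support and confidence thresholds are all assumptions chosen for illustration, and the rule validation step is omitted.

```python
from itertools import combinations

def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm in one dimension; returns sorted cluster centres."""
    s = sorted(values)
    # Deterministic start (an assumption): evenly spaced quantiles of the data.
    centres = [s[int((i + 0.5) * len(s) / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[min(range(k), key=lambda c: abs(v - centres[c]))].append(v)
        new = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
        if new == centres:
            break
        centres = new
    return sorted(centres)

def discretize(name, values, k):
    """Replace each value by an item label such as 'temp=bin1'."""
    centres = kmeans_1d(values, k)
    return [f"{name}=bin{min(range(k), key=lambda c: abs(v - centres[c]))}"
            for v in values]

def bin_intervals(values, labels):
    """Value interval (min, max) covered by each bin label."""
    out = {}
    for v, lab in zip(values, labels):
        lo, hi = out.get(lab, (v, v))
        out[lab] = (min(lo, v), max(hi, v))
    return out

def apriori(transactions, min_support):
    """Frequent itemsets and their supports (level-wise search, no prune step)."""
    n = len(transactions)
    support = {}
    current = {frozenset([i]) for t in transactions for i in t}
    while current:
        counts = {c: sum(c <= t for t in transactions) / n for c in current}
        frequent = {c: s for c, s in counts.items() if s >= min_support}
        support.update(frequent)
        # Join frequent k-itemsets that differ in one item into (k+1)-candidates.
        current = {a | b for a, b in combinations(list(frequent), 2)
                   if len(a | b) == len(a) + 1}
    return support

def rules(support, min_confidence):
    """Rules lhs -> rhs with confidence = support(lhs and rhs) / support(lhs)."""
    out = []
    for itemset, s in support.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = s / support[lhs]
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, s, conf))
    return out

# Hypothetical toy data: daily temperature and daily water supply.
temp = [10, 11, 12, 11, 30, 31, 29, 32]
supply = [100, 102, 101, 99, 150, 152, 149, 151]

temp_labels = discretize("temp", temp, k=2)
supply_labels = discretize("supply", supply, k=2)
intervals = bin_intervals(temp, temp_labels)

transactions = [set(pair) for pair in zip(temp_labels, supply_labels)]
found = rules(apriori(transactions, min_support=0.4), min_confidence=0.9)
for lhs, rhs, s, conf in found:
    print(sorted(lhs), "->", sorted(rhs), f"support={s:.2f} confidence={conf:.2f}")
```

Because each continuous value is replaced by a bin label, the mined rules depend only on co-occurrence of bins, and the `bin_intervals` helper recovers the value interval each label stands for.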
Water supply fluctuates with residents' living habits and population mobility, so daily water supply is markedly non-stationary, which poses a great challenge to water demand prediction based on data-driven models. To solve this problem, the Hodrick-Prescott (HP) and wavelet transform (WT) time series decomposition methods and ensemble learning (EL) were introduced; coupling models built from bidirectional long short-term memory (BLSTM), seasonal autoregressive integrated moving average (SARIMA) and Gaussian radial basis function neural network (GRBFNN) were developed; and interval prediction was carried out based on Student's t-test (T-test). The method was applied to daily water demand prediction in Shenzhen and cross-validation was performed. The decomposed subseries show clear regularity, and WT is superior to the HP decomposition method. However, the maximum decomposition level (MDL) of WT should not be set too high, otherwise the trend characteristics of the subseries are weakened. The results show that the potential characteristics and quantitative relationships of historical data can be learned accurately with WT and the coupling model. Although the coronavirus disease 2019 (COVID-19) outbreak in 2020 changed the water supply pattern, this variation still fell within the predicted interval. WT combined with the coupling model predicted water demand satisfactorily, giving the lowest mean square error (0.17%), mean relative error (0.1) and mean absolute error (3.32%) and the highest Nash-Sutcliffe efficiency (97.21%) and correlation coefficient (0.99) on the testing set.
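To make the role of the decomposition level concrete, the following is a minimal sketch of a multi-level discrete wavelet decomposition. The Haar wavelet and the toy series are assumptions made for illustration; the abstract does not state which mother wavelet was used, and the prediction models themselves are not shown.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_decompose(x, level):
    """Decompose x into one approximation and `level` detail subseries."""
    if level < 1 or len(x) % (2 ** level) != 0:
        raise ValueError("series length must be divisible by 2**level")
    approx, details = list(x), []
    for _ in range(level):
        approx, d = haar_step(approx)
        details.append(d)  # details[0] is the finest scale
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, from the coarsest detail back to the finest."""
    x = list(approx)
    for d in reversed(details):
        nxt = []
        for a, dd in zip(x, d):
            nxt.extend([(a + dd) / SQRT2, (a - dd) / SQRT2])
        x = nxt
    return x

# Hypothetical daily supply values (arbitrary units).
series = [100.0, 102.0, 101.0, 99.0, 150.0, 152.0, 149.0, 151.0]

approx, details = haar_decompose(series, level=2)
rebuilt = haar_reconstruct(approx, details)
print("approximation:", [round(a, 2) for a in approx])
print("detail lengths:", [len(d) for d in details])
```

Each extra level halves the length of the approximation subseries, so with a short series a high level leaves only a few coarse coefficients to carry the trend.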
In this study, a deep learning model based on a zero-sum game (ZSG) was proposed for accurate water demand prediction. Ensemble learning was introduced to enhance the generalization ability of the models, and a sliding average was designed to address the non-stationarity of the time series. Because the deep learning model could not predict water supply fluctuations caused by emergencies, a hypothesis testing method combining Student's t-test and the discrete wavelet transform was proposed to generate an envelope interval around the predicted values and to carry out rolling revisions. The methods were applied to Shenzhen, a megacity with extremely scarce water resources. The results showed that the regular bidirectional models were superior to the unidirectional model, and the ZSG-based bidirectional models were superior to the regular bidirectional models. Bidirectional propagation helped improve the generalization ability of the model, and ZSG better guided the model toward the optimal solution. The fluctuations in water supply were mainly caused by the floating population, but they remained within the envelope interval of the predicted values. The predicted values after rolling revisions were very close to the measured values.
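The sliding average and the envelope interval can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the paper: the envelope here is a standard t-based prediction interval built from past residuals, the wavelet component and the rolling revision are omitted, the residuals and predictions are made-up numbers, and the critical value 2.262 is the two-sided 95% Student's t quantile for 9 degrees of freedom.

```python
import math
import statistics

def sliding_average(x, window):
    """Mean of each run of `window` consecutive values (trailing window)."""
    return [sum(x[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(x))]

def t_envelope(predictions, residuals, t_crit):
    """Interval around each prediction from the spread of past residuals.

    half-width = t_crit * s * sqrt(1 + 1/n), with s the sample standard
    deviation of the residuals (observed minus predicted).
    """
    n = len(residuals)
    mean = statistics.fmean(residuals)
    s = statistics.stdev(residuals)
    half = t_crit * s * math.sqrt(1 + 1 / n)
    return [(p + mean - half, p + mean + half) for p in predictions]

def outside(observed, envelope):
    """Indices where an observation falls outside its envelope."""
    return [i for i, (y, (lo, hi)) in enumerate(zip(observed, envelope))
            if not lo <= y <= hi]

# Hypothetical residuals from ten earlier days (observed minus predicted).
residuals = [-1.0, 0.5, 1.0, -0.5, 0.0, 1.5, -1.5, 0.5, -0.5, 0.0]
T_CRIT_95_DF9 = 2.262  # two-sided 95% t quantile, 9 degrees of freedom

predictions = [100.0, 101.0, 102.0]   # hypothetical model output
observed = [100.8, 99.5, 106.0]       # hypothetical measurements

smoothed = sliding_average([1, 2, 3, 4, 5], window=3)
envelope = t_envelope(predictions, residuals, T_CRIT_95_DF9)
flagged = outside(observed, envelope)
print("smoothed:", smoothed)
print("envelope:", [(round(lo, 2), round(hi, 2)) for lo, hi in envelope])
print("days outside the envelope:", flagged)
```

An observation flagged by `outside` is the kind of departure that would prompt a revision of the forecast in a rolling scheme.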