Multivariate predictive analysis of stream-water (SW) parameters, including dissolved oxygen, specific conductance, discharge, water level, temperature, pH, and turbidity, is crucial in water resource management. It is especially important in a period of rapid climate change, when shifting weather patterns make it difficult to forecast these SW variables accurately for different water-related problems. Various physics-based numerical models are used to forecast SW variables. These models rely on numerous hydrologic parameters and require extensive laboratory investigation and calibration to reduce uncertainty. However, with the emergence of data-driven analysis and prediction methods, deep-learning algorithms have demonstrated satisfactory performance in handling sequential data. In this study, comprehensive Exploratory Data Analysis (EDA) and feature engineering were conducted to prepare the dataset for the predictive model. A Long Short-Term Memory (LSTM) neural network regression model was trained on several years of daily data, enabling the prediction of SW variables up to one week in advance (the lead time) with satisfactory accuracy. The model’s performance was evaluated by comparing the predicted data with observed data, analyzing the error distribution, and computing a set of error metrics. Performance improved with more training epochs and with hyperparameter tuning. With proper feature engineering and optimization, this model can be adapted to other locations to facilitate univariate predictive analysis and potentially support the real-time prediction of SW variables.
Multivariate predictive analysis of Stream-Water (SW) parameters (discharge, water level, temperature, dissolved oxygen, pH, turbidity, and specific conductance) is a pivotal task in water resource management in an era of rapid climate change. The highly dynamic and evolving nature of meteorological and climatic features has a significant impact on the temporal distribution of the SW variables, which makes forecasting them for diverse water-related issues even more complicated. To predict the SW variables, various physics-based numerical models are used, and these depend on numerous hydrologic parameters. Extensive lab-based investigation and calibration are required to reduce the uncertainty in those parameters. However, in the age of data-informed analysis and prediction, several deep learning algorithms have shown satisfactory performance in dealing with sequential data. In this research, comprehensive Exploratory Data Analysis (EDA) and feature engineering were performed to prepare the dataset and obtain the best performance from the predictive model. A Long Short-Term Memory (LSTM) neural network regression model was trained on several years of daily data to predict the SW variables up to one week ahead of time (lead time) with satisfactory performance. The performance of the proposed model was found to be highly adequate through comparison of the predicted data with the observed data, visualization of the error distribution, and a set of error metrics. Higher performance was achieved by increasing the number of epochs and tuning hyperparameters. With proper feature engineering and optimization, this model can be transferred to other locations to perform univariate predictive analysis and can potentially be used for real-time prediction of SW variables.
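To make the forecasting setup concrete, the sketch below shows one common way to turn a daily multivariate series into input/target pairs for a sequence model with a one-week lead time. It is only an illustration under stated assumptions, not the study's actual preprocessing: the function name `make_windows`, the 14-day lookback, and the synthetic data are hypothetical, and only the seven SW variables and the 7-day lead time come from the abstract.

```python
import numpy as np

def make_windows(series, lookback, lead_time):
    """Slice a (T, F) daily series into supervised-learning pairs.

    X has shape (N, lookback, F): the past `lookback` days of all F variables.
    y has shape (N, lead_time, F): the following `lead_time` days to predict.
    """
    series = np.asarray(series, dtype=float)
    n_samples = series.shape[0] - lookback - lead_time + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = np.stack([series[i + lookback:i + lookback + lead_time]
                  for i in range(n_samples)])
    return X, y

# Synthetic stand-in for 60 days of 7 SW variables (not real observations).
rng = np.random.default_rng(0)
data = rng.normal(size=(60, 7))

# Assumed 14-day lookback; 7-day lead time as described in the abstract.
X, y = make_windows(data, lookback=14, lead_time=7)
print(X.shape, y.shape)  # (40, 14, 7) (40, 7, 7)
```

Arrays shaped this way can be fed to most LSTM implementations, which typically expect input of the form (samples, time steps, features). The lookback length is a tunable choice rather than something the abstract specifies.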