The Transient Storage Model (TSM), which considers the storage exchange process that induces an abnormal mixing phenomenon, has been widely used to analyze solute transport in natural rivers. The primary step in applying the TSM is the calibration of four key parameters, namely the flow zone dispersion coefficient (Kf), main flow zone area (Af), storage zone area (As), and storage exchange rate (α), by fitting measured breakthrough curves (BTCs). In this study, to avoid the costly tracer tests that parameter calibration requires, two dimensionless empirical models for estimating the TSM parameters were derived using multi-gene genetic programming (MGGP) and principal components regression (PCR). A total of 128 datasets with complete variables, drawn from 14 published papers through an extensive meta-analysis, were used in the derivations. The performance comparison revealed that the MGGP-based equations yielded superior prediction results. In a TSM analysis of field experiment data from Cheongmi Creek, South Korea, all assessed empirical equations produced acceptable BTCs, but the MGGP model was superior to the other models in its predicted parameter values. The BTCs predicted by the empirical models in some highly complicated reaches were biased because Af was mispredicted. Sensitivity analyses of the MGGP models showed that sinuosity is the most influential factor for Kf, while Af, As, and α are more sensitive to U/U*. This study demonstrates that the MGGP-based model can be used for economical TSM analysis, providing an alternative to direct calibration and a source of initial parameters for inverse modeling.
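For orientation, the four calibrated parameters enter the TSM through a coupled pair of equations for the main flow zone and the storage zone. The form below is the commonly used one-dimensional version without lateral inflow; it is a sketch for reference and not necessarily the exact variant used in the study. The symbols C (main-zone concentration), C_s (storage-zone concentration), and Q (discharge) are introduced here and do not appear in the abstract.

```latex
\begin{aligned}
\frac{\partial C}{\partial t}
  &= -\frac{Q}{A_f}\,\frac{\partial C}{\partial x}
     + \frac{1}{A_f}\,\frac{\partial}{\partial x}\!\left(A_f K_f \frac{\partial C}{\partial x}\right)
     + \alpha\,(C_s - C), \\
\frac{\partial C_s}{\partial t}
  &= \alpha\,\frac{A_f}{A_s}\,(C - C_s).
\end{aligned}
```

The exchange term α(C_s − C) is what produces the long tails of measured BTCs: solute is temporarily held in the storage zone of area A_s and released back into the main flow zone after the bulk of the plume has passed.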
To minimize the damage from contaminant accidents in rivers, early identification of the contaminant source is crucial. In this study, a framework combining Machine Learning (ML) and the Transient Storage zone Model (TSM) was therefore developed to predict the spill location and mass of a contaminant source. The TSM was employed to simulate non-Fickian breakthrough curves (BTCs), which carry information about the contaminant source. The ML models then used the BTC features, characterized by 21 variables, to predict the spill location and mass. The proposed framework was applied to Gam Creek, South Korea, where two tracer tests were conducted. Six ML methods were applied to predict the spill location and mass, and the most relevant BTC features were selected by Recursive Feature Elimination with Cross-Validation (RFECV). Applications of the models to field data showed that the ensemble decision-tree models, Random Forest (RF) and XGBoost (XGB), were the most efficient and feasible in predicting the contaminant source.
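The feature-selection step described above can be illustrated with scikit-learn's `RFECV` wrapped around a random forest. This is a minimal sketch under stated assumptions, not the study's pipeline: the data are synthetic (generated by `make_regression` as a stand-in for a table of 21 BTC features), the target is a single generic regression variable rather than the actual spill location or mass, and the hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

# Synthetic stand-in for a BTC feature table (assumption): 200 samples,
# 21 features, of which only 6 actually influence the target.
X, y = make_regression(
    n_samples=200, n_features=21, n_informative=6, noise=5.0, random_state=0
)

# Random forest supplies the feature importances that RFECV ranks on.
forest = RandomForestRegressor(n_estimators=30, random_state=0)

# Drop the 2 least important features per round and score each subset
# with 3-fold cross-validated R^2; the best-scoring subset is kept.
selector = RFECV(estimator=forest, step=2, cv=3, scoring="r2",
                 min_features_to_select=1)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)
print("features kept:", selector.n_features_)
print("column indices:", selected.tolist())
```

With real data, `X` would hold the 21 BTC features of each TSM-simulated spill scenario and `y` the known spill location or mass, and the reduced matrix `selector.transform(X)` would be passed to the final predictor.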