The degradation of water quality has become a critical concern worldwide, necessitating innovative approaches to monitoring and prediction. This paper proposes an integrated framework that combines the Internet of Things (IoT) and machine learning for comprehensive water quality analysis and prediction. The IoT-enabled framework comprises four modules: sensing, coordinator, data processing, and decision. It is equipped with temperature, pH, turbidity, and Total Dissolved Solids (TDS) sensors to collect data from Rohri Canal, SBA, Pakistan. The acquired data is preprocessed and then analyzed using machine learning models to predict the Water Quality Index (WQI) and Water Quality Class (WQC). Preprocessing consists of data cleaning, Z-score normalization, correlation analysis, and splitting, all performed before the models are applied. Four regression models, LSTM (Long Short-Term Memory), SVR (Support Vector Regression), MLP (Multilayer Perceptron), and NARNet (Nonlinear Autoregressive Network), are employed to predict the WQI, and four classification models, SVM (Support Vector Machine), XGBoost (eXtreme Gradient Boosting), Decision Trees, and Random Forest, are employed to predict the WQC. The data used for evaluating the models is organized into two sets: Dataset 1 comprises 600 values for each parameter, while Dataset 2 includes the complete set of 6000 values for each parameter. This division enables the models' performance to be compared across dataset sizes. The results indicate that the MLP regression model has the strongest predictive performance, with the lowest Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) values and the highest R-squared (0.93).
In contrast, the SVR model demonstrates weaker performance, evidenced by higher errors and a lower R-squared (0.73). Among the classification algorithms, Random Forest achieves the highest metrics: accuracy (0.91), precision (0.93), recall (0.92), and F1-score (0.91). The results also suggest that the machine learning models perform better on the smaller dataset than on the larger one. Moreover, comparisons with existing studies show improved regression performance in this study, with consistently lower errors and higher R-squared values, while for classification the Random Forest model outperforms the other models on accuracy, precision, recall, and F1-score.
INDEX TERMS: Data collection, Environmental monitoring, Internet of Things (IoT), Machine learning, Water quality analysis, Water quality class (WQC), Water quality index (WQI).
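The Z-score normalization and the evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the use of the population standard deviation, the macro averaging of the per-class scores, and the toy inputs in the usage note are all assumptions made for illustration, and none of the values come from the Rohri Canal data.

```python
import math
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used; the abstract
    does not say which variant the authors applied.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    # R-squared is undefined when the targets have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Assumption: macro averaging (unweighted mean over classes); the
    abstract does not state how per-class scores were combined.
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {
        "accuracy": accuracy,
        "precision": mean(precisions),
        "recall": mean(recalls),
        "f1": mean(f1s),
    }
```

As a usage example with made-up readings, `zscore([6.8, 7.1, 7.4])` rescales three hypothetical pH values so that they average to zero, `regression_metrics(wqi_true, wqi_pred)` returns the four regression scores for a pair of hypothetical WQI lists, and `classification_metrics(wqc_true, wqc_pred)` returns the four classification scores for a pair of hypothetical class-label lists.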