Rivers are the main source of freshwater for the world's population. However, many economic activities contribute to river water pollution. River water quality can be monitored using various parameters, such as pH, dissolved oxygen, total suspended solids, and chemical properties. Analyzing the trends and patterns of these parameters enables the prediction of water quality, so that the relevant authorities can take proactive measures to prevent water pollution and can predict the effectiveness of water restoration measures. Machine learning regression algorithms can be applied for this purpose. Here, eight machine learning regression techniques (decision tree regression, linear regression, ridge, Lasso, support vector regression, random forest regression, extra tree regression, and the artificial neural network) are applied to water quality index prediction. Historical data from Indian rivers are adopted for this study. The data cover six water parameters. Twelve other features are then derived from the original six parameters. The performances of the models built with different algorithms and sets of features are compared. The derived water quality rating scale features are found to contribute to better regression models, while linear regression and ridge offer the best performance. The best mean square error achieved is 0 and the corresponding correlation coefficient is 1.
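The two evaluation measures named in the abstract above, mean square error and the correlation coefficient, can be sketched in a few lines. This is a minimal illustration, assuming the correlation coefficient is the Pearson coefficient between actual and predicted index values; the function names and the sample values are hypothetical and do not come from the study.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical water quality index values, for illustration only.
actual_wqi = [50.0, 70.0, 90.0]
exact_predictions = [50.0, 70.0, 90.0]
shifted_predictions = [55.0, 75.0, 95.0]

print(mean_squared_error(actual_wqi, exact_predictions))
print(pearson_correlation(actual_wqi, exact_predictions))
print(mean_squared_error(actual_wqi, shifted_predictions))
print(pearson_correlation(actual_wqi, shifted_predictions))
```

The shifted predictions show why both measures are reported together: a constant offset leaves the correlation at 1 while the mean square error is clearly nonzero, so only the pair (0, 1) indicates an exact fit.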
Recently, the healthcare industry has started generating large volumes of data. If hospitals can make use of these data, they can more easily predict outcomes and provide better treatment at early stages and at low cost. Here, data analytics (DA) was used to support correct decisions through proper analysis and prediction. However, inappropriate data may lead to flawed analysis and thus yield unacceptable conclusions. Hence, transforming the improper data in a data set into useful data is essential. Machine learning (ML) techniques were used to overcome the issues caused by incomplete data. A new architecture, automatic missing value imputation (AMVI), was developed to predict missing values in the dataset; it includes data sampling and feature selection. Four prediction models (logistic regression, support vector machine (SVM), AdaBoost, and random forest) were selected from the well-known classification algorithms. The performance of the complete AMVI architecture was evaluated using a structured data set obtained from the UCI repository. An accuracy of around 90% was achieved. Cross-validation also confirmed that the trained ML model is suitable and not over-fitted. The trained model is developed from the dataset and does not depend on a specific environment. The architecture trains the candidate models and selects the best-performing one depending on the data available.
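To make the idea of filling in missing values concrete, the following is a minimal sketch of column-mean imputation. It is a simple baseline swapped in for illustration, not the AMVI architecture described above, which predicts missing values with ML models. The function name, the use of `None` as the missing-value marker, and the sample records are all assumptions.

```python
def impute_column_means(rows):
    """Replace each None with the mean of the observed values in its column.

    Baseline stand-in only: the abstract's AMVI architecture predicts
    missing values with ML models, which this sketch does not do.
    """
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        # A column with no observed values cannot be imputed; keep None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [
        [means[col] if value is None else value for col, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical patient records: [age, resting heart rate], None = missing.
records = [
    [40.0, None],
    [60.0, 70.0],
    [None, 80.0],
]
print(impute_column_means(records))
```

A completed table like this can then be passed to any of the four classifiers named in the abstract; comparing their cross-validated accuracy with and without imputation is one way to check whether the filled-in values help.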
Smart-home systems have achieved great popularity in the last decade, as they increase comfort and quality of life. Reducing energy consumption has become a very important goal in the context of the explosive technological development of modern society, with a major impact on the future development of mankind. Moreover, the large amount of data available from smart meters installed in households makes it possible to find data abnormalities for better monitoring and forecasting. Detecting data anomalies helps in making better decisions for reducing wasted energy. In recent years, machine learning models have been widely used for developing intelligent systems. Currently, researchers focus mainly on developing supervised learning models for predicting anomalies. However, such models are difficult to train when the data carry no labels indicating whether a point is anomalous. In this paper, abnormalities in electricity usage are detected using unsupervised learning and evaluated using Excess Mass. The unsupervised anomaly detection models are based on the Gaussian Mixture Model (GMM) and Isolation Forest (iForest). The models are compared with Local Outlier Factor (LOF) and the one-class support vector machine (OCSVM). The proposed framework is tested with actual electricity usage and temperature data obtained from the Numenta Anomaly Benchmark (NAB), which contains normal and anomalous data in time series. The iForest was observed to outperform the other detection models for the selected use case. The outcome showed that the iForest can quickly detect anomalies in electricity usage data from a sequence of data alone, without feature extraction. The proposed model suits the practical requirements of a Smart Home Energy Management System and can be implemented in individual houses independently. The proposed system can also be extended to other use cases with similar data types.
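The isolation principle behind iForest can be sketched on a one-dimensional sequence of readings: points that random cuts separate from the rest after only a few splits receive scores close to 1. This is a simplified illustration under stated assumptions, not the paper's implementation; it handles a single feature only, and the function names, parameter defaults, and synthetic readings are hypothetical.

```python
import math
import random


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition 1-D points with uniformly random cut values."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    lo, hi = min(points), max(points)
    if lo == hi:  # identical values cannot be separated further
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    return {
        "cut": cut,
        "left": build_tree([p for p in points if p < cut], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p >= cut], depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Number of splits needed to reach x's leaf, plus a correction for leaf size."""
    if "cut" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x < node["cut"] else node["right"]
    return path_length(x, child, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=64, seed=0):
    """Score each value in (0, 1]; higher means easier to isolate."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-(sum(path_length(v, t) for t in trees) / n_trees) / norm)
        for v in values
    ]


# Synthetic usage readings (not NAB data): a regular pattern plus one spike.
usage = [1.0 + 0.1 * (i % 10) for i in range(60)] + [9.5]
scores = anomaly_scores(usage)
print(scores.index(max(scores)))  # position of the most anomalous reading
```

Because the spike lies far outside the range of the other readings, most random cuts separate it at the first split, which is why no labels or hand-built features are needed for this kind of detection.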