Purpose: This study describes the trends and applications of machine learning systems in the management of water supply networks. Machine learning is a field in constant development, with great potential to deliver improvements in real industries. The recent tendency of companies that manage water supply networks to store data has created a range of possibilities for applying machine learning. One particular case is the prediction of pipe failures based on historical data, which can help to optimally plan renovation and maintenance tasks. The objective of this work is to define the stages and main characteristics of machine learning systems, focusing on supervised learning methods. Additionally, singularities that are usually found in data from water supply networks are highlighted.
Design/methodology/approach: For this purpose, eight studies containing real cases from around the world are discussed. The methods used in each study are reviewed, from data processing to model validation. Moreover, the most popular models are briefly defined together with the mechanisms that best suit their performance.
Findings: The imbalanced class problem was found to be typical of data from water supply networks, where only a small percentage of pipes fail. Consequently, sampling methods are recommended for training classifiers; however, they are not necessary when training a regression system. Additionally, scaling and transformation of variables generally have a positive impact on model performance. Currently, cross-validation is almost a requirement for obtaining reliable and representative results. This technique is employed in all the reviewed studies to train and validate their models.
Originality/value: The use of machine learning systems to predict pipe failures in water supply networks is still a developing field. This study tries to define the advantages and disadvantages of different methods to process data from water supply networks, as well as to train and validate the models.
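The cross-validation step mentioned in the findings can be illustrated with a minimal sketch. It assumes a binary label per pipe (1 = failed, 0 = did not fail) and uses stratified folds so that each fold keeps roughly the same share of failures, which matters when failures are rare. The function names, the fold count, and the toy label vector are hypothetical and are not taken from any of the reviewed studies.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split row indices into k folds with a similar class ratio in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validation_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k, seed)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy data (assumption): 100 pipes, of which 10 have failed.
labels = [1] * 10 + [0] * 90
splits = list(cross_validation_splits(labels, k=5))
```

Any resampling used to balance the classes would normally be applied to `train_idx` only, so that each test fold keeps the original failure rate.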
The water supply networks of many countries are experiencing a drastic increase in the number of pipe failures. To reverse this tendency, it is essential to optimise pipe replacement plans. For this reason, companies demand pioneering techniques to predict which pipes are more prone to fail. In this study, an Artificial Neural Network (ANN) is designed to classify pipes according to their predisposition to fail based on physical and operational input variables. In addition, the usefulness and effectiveness of two sampling methods, under-sampling and over-sampling, are analysed. The model is implemented using the open-source software Weka, which is specialised in machine-learning algorithms. The system is tested with a database from a real water network in Spain, obtaining highly accurate results. It is verified that balancing the training set is essential to increasing the accuracy of the predictions. Furthermore, under-sampling prioritises true positive rates, whereas over-sampling makes the system learn to predict failures and non-failures with the same precision.
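The two sampling methods compared above can be sketched as follows. This is a plain Python illustration of random under-sampling and random over-sampling, not the Weka configuration used in the study; the function names and the toy label vector are assumptions made for the example.

```python
import random


def _minority_and_majority(labels):
    """Return (minority_indices, majority_indices) for binary 0/1 labels."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if len(positives) <= len(negatives):
        return positives, negatives
    return negatives, positives


def under_sample(labels, seed=0):
    """Keep every minority row and a random, equally sized subset of majority rows."""
    rng = random.Random(seed)
    minority, majority = _minority_and_majority(labels)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


def over_sample(labels, seed=0):
    """Keep every row and repeat random minority rows until the classes match."""
    rng = random.Random(seed)
    minority, majority = _minority_and_majority(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return sorted(majority + minority + extra)


# Toy training set (assumption): 100 pipes, 5 of which have failed.
labels = [1] * 5 + [0] * 95
under = under_sample(labels)  # indices of a balanced, smaller training set
over = over_sample(labels)    # indices of a balanced, larger training set
```

Both functions return row indices rather than rows, so the same selection can be applied to the feature table and the label vector. Under-sampling discards most non-failure records, while over-sampling keeps all records at the cost of duplicated failure rows.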