Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. Machine learning has been used successfully to predict water quality. However, research on machine learning for water quality index (WQI) prediction is generally lacking. Therefore, this research aims to identify the important features for the WQI, which necessitated the classification of numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. The Langat Basin in Selangor dataset from the Department of Environment of Malaysia trains and validates each machine learning model. Several data preprocessing tasks such as data cleaning and feature selection have been conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further rectified based on the selected features set by several feature selection strategies such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters to achieve optimum values before comparing the output of the three classifiers against each other. The observational results have shown that the optimized Random Forest classifier with the WQI parameter selected by the information gain feature selection method achieved the highest performance. The experimental results show that the WQI parameters are more relevant in predicting the WQI than the other variables. Consequently, this result shows that parameter oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems.