Groundwater quality is typically assessed using water quality indices such as the Groundwater Quality Index (GQI) or the GroundWater Quality Index (GWQI). These indices are calculated from field data with a scoring system that combines the ratios of constituent concentrations to their prescribed standards with weights reflecting each constituent’s relative importance. The results of this procedure are inherently subjective, so different water quality indices may conflict with one another. The novelty of this research is to mitigate the conflicts between GQI and GWQI results by using the predictive power of artificial intelligence (AI) models and by integrating multiple water quality indicators into one representative index through data fusion based on catastrophe theory. The study employed a two-level AI modeling strategy. At Level 1, three indices were calculated: GQI, GWQI, and a data-fusion index based on four pollutants: manganese (Mn), arsenic (As), lead (Pb), and iron (Fe). At Level 2, further data fusion was applied using supervised learning methods, namely Mamdani fuzzy logic (MFL), support vector machine (SVM), artificial neural network (ANN), and random forest (RF), with the GQI and GWQI values calculated at Level 1 as inputs and the Level 1 data-fusion index as the target. We applied these methods to the Gulfepe-Zarinabad subbasin in northwest Iran. The results show that all AI models performed reasonably well and that the differences between models were negligible, based on the root mean square error (RMSE) and the coefficient of determination (r²). RF (r² = 0.995 and RMSE = 0.006 in the test phase) and MFL (r = 0.921 and RMSE = 0.022 in the test phase) had the best and worst performances, respectively. The results indicate that AI models mitigate the conflicts between GQI and GWQI results.
The method presented in this study can also be applied to modeling other aquifers.
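The scoring idea described above (ratios of constituents to prescribed standards, combined with importance weights) can be illustrated with a minimal sketch. This is a generic weighted-ratio index, not the exact GQI or GWQI formula, which is not given here. The function name `weighted_ratio_index` is hypothetical, and the concentrations, limits, and weights are illustrative placeholders, not measured data or regulatory values.

```python
def weighted_ratio_index(concentrations, standards, weights):
    """Generic weighted-ratio quality index (hypothetical helper).

    Each constituent contributes (concentration / standard), scaled by
    its weight normalised so that all weights sum to one.
    """
    total_weight = sum(weights.values())
    index = 0.0
    for name, conc in concentrations.items():
        ratio = conc / standards[name]             # exceedance ratio
        rel_weight = weights[name] / total_weight  # normalised weight
        index += rel_weight * ratio
    return index


# Illustrative placeholder values only (assumed, in mg/L).
sample = {"Mn": 0.2, "As": 0.02, "Pb": 0.005, "Fe": 0.6}
limits = {"Mn": 0.4, "As": 0.01, "Pb": 0.01, "Fe": 0.3}
weights = {"Mn": 2, "As": 5, "Pb": 5, "Fe": 1}

index_value = weighted_ratio_index(sample, limits, weights)
print(round(index_value, 3))
```

Because the weights are chosen by the analyst, two indices built from the same sample with different weight sets can rank the sample differently, which is the kind of subjectivity the two-level strategy aims to reduce.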