Carbon capture and storage (CCS) projects are increasingly important because production of greenhouse gases, especially carbon dioxide (CO₂), continues to grow. The two target functions in this study, CO₂ storage and water production, were chosen because storage is central to CCS projects and because projects that require water production benefit from producing less of it. Support vector regression, an artificial neural network (ANN), and multivariate adaptive regression splines (MARS) were used as proxies for reservoir simulation. Comparing these three data-driven models on the available field data showed that MARS was the most accurate, although only marginally more so than the ANN. For predicting CO₂ storage, MARS gave a root mean square error (RMSE), mean absolute error (MAE), and R² of 2.78%, 1.95%, and 0.998, respectively, on the test data, and 3.73%, 3.53%, and 0.995 on the blind data. For predicting water production, the machine learning (ML) model gave an RMSE, MAE, and R² of 3.58%, 2.81%, and 0.997, respectively, on the test data, and 3.94%, 2.83%, and 0.997 on the blind data. Because MARS reproduces the simulation data with less computational load and time, it is well suited to optimization. This study highlights the coupling of the ML approach with genetic algorithms to optimize CO₂ storage and water production. The proposed framework applies preprocessing, feature selection, two-stage validation of the data-driven model, and optimization to the aquifer, and builds a repository of optimal solutions for use in projects.
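The three accuracy measures quoted above can be computed from paired simulator and proxy outputs as in the minimal sketch below. It uses the standard definitions of RMSE, MAE, and R². The abstract does not state how the errors were converted to percentages, so the helper `as_percent_of_range`, which scales an error by the range of the reference values, is an assumption made for illustration only, and the function names are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


def as_percent_of_range(error, y_true):
    """ASSUMPTION: express an error as a percentage of the range of y_true.

    The source does not say which normalization it used.
    """
    return 100.0 * error / (max(y_true) - min(y_true))


# Illustrative numbers only, not data from the study.
simulated = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
rmse, mae, r2 = regression_metrics(simulated, predicted)
rmse_pct = as_percent_of_range(rmse, simulated)
```

Applying the same function first to the test split and then to a separate blind set would give the kind of two-stage comparison the abstract reports.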