In the context of Carbon Capture and Storage (CCS), monitoring the behaviour of CO₂ within subsurface reservoirs is pivotal for efficient and safe storage operations. This study proposes a new approach to improve the accuracy of CO₂ saturation maps predicted directly from shot gathers, using a combination of deep learning (DL) and feature extraction methods. We employ a U-Net model tailored to regression tasks, together with the two-dimensional continuous wavelet transform (2D-CWT) to analyse shot gathers at different scales. We introduce a novel approach that uses multi-channel input data for the DL model, combining shot gathers and CWT images. The DL model was trained and tested using both single-channel and multi-channel input. The single-channel datasets included shot gathers and CWT images at the first scale, while the multi-channel datasets integrated shot gathers with either two or four scales of CWT. We conducted a sensitivity analysis on the number of training epochs to compare the performance of the model with multi-channel and single-channel input. Additionally, a transfer learning approach was implemented to improve performance in noisy conditions by leveraging knowledge from a model pre-trained on noise-free images. Monte Carlo dropout was used to estimate pixel-scale prediction variability, yielding standard deviations ranging from 0.019 to 0.194 and enhancing the robustness of CO₂ saturation estimates. Our results suggest that combining feature extraction using 2D-CWT with the U-Net architecture improves the prediction of CO₂ saturation in a synthetic CCS model. Additionally, multi-channel input is more efficient than single-channel input, as it requires less training and prediction time to achieve comparable results. These results confirm the effectiveness of 2D-CWT as a pre-processing technique and contribute valuable insights and advances to the field of data-driven geophysical inversion.
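
To illustrate the multi-channel input described above, the following is a minimal sketch and not the authors' implementation. It assumes an isotropic Mexican-hat wavelet, an FFT-based transform with periodic boundaries, per-channel standardisation, and dyadic scales (1, 2, 4, 8); the abstract specifies none of these choices. The function names `mexican_hat_cwt2d`, `normalise`, and `build_multichannel_input` are hypothetical, and a random array stands in for a real shot gather.

```python
import numpy as np


def mexican_hat_cwt2d(image, scale):
    """2D CWT of `image` at one scale, using an isotropic Mexican-hat wavelet.

    Assumption: the wavelet is applied in the Fourier domain, where its
    transform is |k|^2 * exp(-|k|^2 / 2) dilated by `scale`. Using the FFT
    implies periodic (wrap-around) boundaries.
    """
    n_rows, n_cols = image.shape
    # Angular frequencies along each axis, in radians per sample.
    ky = 2.0 * np.pi * np.fft.fftfreq(n_rows)[:, None]
    kx = 2.0 * np.pi * np.fft.fftfreq(n_cols)[None, :]
    k2 = (scale ** 2) * (kx ** 2 + ky ** 2)
    # The factor `scale` keeps the wavelet's L2 norm constant across scales.
    wavelet_hat = scale * k2 * np.exp(-0.5 * k2)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * wavelet_hat))


def normalise(channel):
    """Shift a channel to zero mean and, if possible, scale to unit std."""
    centred = channel - channel.mean()
    std = centred.std()
    return centred / std if std > 0 else centred


def build_multichannel_input(shot_gather, scales=(1.0, 2.0)):
    """Stack a shot gather with its CWT images as (channels, time, receivers)."""
    channels = [normalise(shot_gather)]
    channels += [normalise(mexican_hat_cwt2d(shot_gather, s)) for s in scales]
    return np.stack(channels, axis=0).astype(np.float32)


# Synthetic stand-in for a shot gather: 128 time samples x 64 receivers.
rng = np.random.default_rng(0)
shot_gather = rng.standard_normal((128, 64))

# Shot gather plus two or four CWT scales, mirroring the two multi-channel cases.
two_scale_input = build_multichannel_input(shot_gather, scales=(1.0, 2.0))
four_scale_input = build_multichannel_input(shot_gather, scales=(1.0, 2.0, 4.0, 8.0))
print(two_scale_input.shape, four_scale_input.shape)
```

Under these assumptions the two arrays carry three and five channels respectively, so a network such as a U-Net would only need its first convolution to accept the corresponding number of input channels.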