Facies classification of image logs plays a vital role in reservoir characterization, especially in the heterogeneous and anisotropic carbonate formations of the Brazilian pre-salt region. Manual classification remains the industry standard for handling the complexity and diversity of image logs, but it is time-consuming, labor-intensive, subjective, and difficult to reproduce. Recent advances in machine learning offer a promising route to automating and accelerating this task. However, previous attempts to train deep neural networks for facies identification have generalized poorly to new data because labeled data are scarce and image logs are inherently intricate. Human errors in the manual labels further degrade the performance of trained models. To overcome these challenges, we propose adopting the state-of-the-art SwinV2-Unet to provide depthwise facies classification for Brazilian pre-salt acoustic image logs. The training process incorporates transfer learning to mitigate overfitting and confident learning to address label errors. In a k-fold cross-validation experiment, with each fold spanning more than 350 meters, we achieve a macro F1 score of 0.90 for out-of-sample predictions, compared with 0.68 for the previous model, a modified version of the widely used U-Net. These findings highlight the effectiveness of the two enhancements: an improved neural network and an enhanced training strategy. Moreover, SwinV2-Unet enables efficient and accurate facies analysis of complex yet informative image logs, which advances the understanding of hydrocarbon reservoirs, saves human effort, and improves productivity.
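For readers unfamiliar with the reported metric, the macro F1 score is the unweighted mean of the per-class F1 scores, so rare facies count as much as common ones. The following is a minimal sketch of that computation, assuming one integer facies label per depth sample; the function name `macro_f1` and the toy labels are hypothetical illustrations and are not taken from the study's code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either sequence."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but was not the truth
            fn[t] += 1      # class t was the truth but was missed

    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (hypothetical labels): three facies classes, six depth samples.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 1, 2/3, and 0.8, so the macro average is about 0.82. In a k-fold setting such as the one described above, the score would be computed on the held-out fold only.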