Background: Diffusion-weighted imaging (DWI), a pivotal component of multiparametric magnetic resonance imaging (mpMRI), plays a pivotal role in the detection, diagnosis, and evaluation of gastric cancer. Despite its potential, DWI is often marred by substantial anatomical distortions and sensitivity artifacts, which can hinder its practical utility. Presently, enhancing DWI’s image quality necessitates reliance on cutting-edge hardware and extended scanning durations. The development of a rapid technique that optimally balances shortened acquisition time with improved image quality would have substantial clinical relevance. Objectives: This study aims to construct and evaluate the unsupervised learning framework called attention dual contrast vision transformer cyclegan (ADCVCGAN) for enhancing image quality and reducing scanning time in gastric DWI. Methods: The ADCVCGAN framework, proposed in this study, employs high b-value DWI (b = 1200 s/mm2) as a reference for generating synthetic b-value DWI (s-DWI) from acquired lower b-value DWI (a-DWI, b = 800 s/mm2). Specifically, ADCVCGAN incorporates an attention mechanism CBAM module into the CycleGAN generator to enhance feature extraction from the input a-DWI in both the channel and spatial dimensions. Subsequently, a vision transformer module, based on the U-net framework, is introduced to refine detailed features, aiming to produce s-DWI with image quality comparable to that of b-DWI. Finally, images from the source domain are added as negative samples to the discriminator, encouraging the discriminator to steer the generator towards synthesizing images distant from the source domain in the latent space, with the goal of generating more realistic s-DWI. The image quality of the s-DWI is quantitatively assessed using metrics such as the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), feature similarity index (FSIM), mean squared error (MSE), weighted peak signal-to-noise ratio (WPSNR), and weighted mean squared error (WMSE). Subjective evaluations of different DWI images were conducted using the Wilcoxon signed-rank test. The reproducibility and consistency of b-ADC and s-ADC, calculated from b-DWI and s-DWI, respectively, were assessed using the intraclass correlation coefficient (ICC). A statistical significance level of p < 0.05 was considered. Results: The s-DWI generated by the unsupervised learning framework ADCVCGAN scored significantly higher than a-DWI in quantitative metrics such as PSNR, SSIM, FSIM, MSE, WPSNR, and WMSE, with statistical significance (p < 0.001). This performance is comparable to the optimal level achieved by the latest synthetic algorithms. Subjective scores for lesion visibility, image anatomical details, image distortion, and overall image quality were significantly higher for s-DWI and b-DWI compared to a-DWI (p < 0.001). At the same time, there was no significant difference between the scores of s-DWI and b-DWI (p > 0.05). The consistency of b-ADC and s-ADC readings was comparable among different readers (ICC: b-ADC 0.87–0.90; s-ADC 0.88–0.89, respectively). The repeatability of b-ADC and s-ADC readings by the same reader was also comparable (Reader1 ICC: b-ADC 0.85–0.86, s-ADC 0.85–0.93; Reader2 ICC: b-ADC 0.86–0.87, s-ADC 0.89–0.92, respectively). Conclusions: ADCVCGAN shows excellent promise in generating gastric cancer DWI images. It effectively reduces scanning time, improves image quality, and ensures the authenticity of s-DWI images and their s-ADC values, thus providing a basis for assisting clinical decision making.