Unpaired image translation with feature-level constraints presents significant challenges, including unstable network training and low diversity in generated tasks. This limitation is typically attributed to the following situations: 1. The generated images are overly simplistic, which fails to stimulate the network’s capacity for generating diverse and imaginative outputs. 2. The images produced are distorted, a direct consequence of unstable training conditions. To address this limitation, the unpaired image-to-image translation with diffusion adversarial network (UNDAN) is proposed. Specifically, our model consists of two modules: (1) Feature fusion module: In this module, one-dimensional SVD features are transformed into two-dimensional SVD features using the convolutional two-dimensionalization method, enhancing the diversity of the images generated by the network. (2) Network convergence module: In this module, the generator transitions from the U-net model to a superior diffusion model. This shift leverages the stability of the diffusion model to mitigate the mode collapse issues commonly associated with adversarial network training. In summary, the CycleGAN framework is utilized to achieve unpaired image translation through the application of cycle-consistent loss. Finally, the proposed network was verified from both qualitative and quantitative aspects. The experiments show that the method proposed can generate more realistic converted images.