Haze degrades the visual quality of remote sensing images, which makes it difficult to interpret them and to extract essential information. To mitigate this degradation, we propose an unsupervised generative adversarial network designed specifically for remote sensing image dehazing. The network contains two structurally identical generators and two structurally identical discriminators. One generator removes haze, while the other generates images with added haze. Each discriminator distinguishes real images from generated ones. Each generator uses an encoder–decoder architecture built from the proposed multi-scale feature-extraction modules and attention modules. The multi-scale feature-extraction module has three branches, each composed of dilated convolutions and attention modules, so that features are extracted with different receptive fields. The attention module combines channel attention and spatial attention, guiding the feature-extraction network to emphasize haze and texture in the remote sensing image. To further improve generator performance, a multi-scale discriminator with three branches is also designed. In addition, the loss function is improved by adding a color-constancy loss to the conventional loss framework. Compared with state-of-the-art methods, the proposed approach achieves the highest peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), demonstrating its effectiveness in removing haze from remote sensing images.
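
As an illustrative sketch only, two quantities named above, a color-constancy loss and the peak signal-to-noise ratio, can be written in a few lines of plain Python. The abstract does not give the exact form of the color-constancy term, so the gray-world formulation below (penalizing squared differences between the mean values of the RGB channels) is an assumption and may differ from the loss actually used. The function names and the representation of an image as a list of (r, g, b) tuples with values in [0, 1] are likewise hypothetical choices made for brevity.

```python
import math
from itertools import combinations


def channel_means(pixels):
    """Mean of each RGB channel over a list of (r, g, b) pixels in [0, 1]."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_constancy_loss(pixels):
    """Assumed gray-world loss: sum of squared differences between
    the mean values of each pair of color channels (R-G, R-B, G-B)."""
    means = channel_means(pixels)
    return sum((a - b) ** 2 for a, b in combinations(means, 2))


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    sq_err = [
        (r[c] - e[c]) ** 2
        for r, e in zip(reference, estimate)
        for c in range(3)
    ]
    mse = sum(sq_err) / len(sq_err)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Under this assumed formulation, a neutral gray image incurs zero color-constancy loss, whereas an image with a color cast is penalized. In a training setup the same term would be computed on the generator output with a differentiable tensor library and added to the adversarial and other losses with a weighting factor, which the abstract does not specify.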