This paper discusses the use of conditional Generative Adversarial Networks
(GANs) to generate dense urban features in satellite images and evaluates their
effectiveness in semantic segmentation tasks. High-resolution true-color satellite
imagery of Mumbai, obtained from Pleiades-1A at a 0.5 m resolution, is used
for the study. The proposed Multiple Discriminator pix2pix (MD-pix2pix) model,
which employs multiple discriminators and a modified training procedure, is introduced
to generate realistic satellite images. The performance of the MD-pix2pix
model is compared to that of the traditional pix2pix model using a mixed dataset of real and
generated satellite images. The synthesized satellite images are assessed for their
effectiveness in semantic segmentation tasks using various CNN models, including
VGG16-UNet, MobileNetV2-UNet, and DeepLabV3+. The study aims to overcome
the limitations of existing datasets that do not include informal settlements,
such as slum areas, which are common in many cities. The results indicate that
the MD-pix2pix model produces more realistic satellite images with greater variability
in vegetation types, slum arrangements, and built-up structures than the traditional
pix2pix model. The synthetically generated satellite images also prove effective in
semantic segmentation tasks, with better segmentation accuracy achieved using
the MD-pix2pix-generated images for computationally less expensive architectures
such as MobileNetV2-UNet. This study highlights the potential of GANs
to generate realistic satellite images for urban feature mapping and monitoring
applications.
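To make the idea concrete, the following is a minimal, purely illustrative sketch (not the paper's implementation) of a pix2pix-style objective extended with multiple discriminators: each discriminator scores the generated image, their adversarial losses are averaged, and an L1 reconstruction term is added as in standard pix2pix. All function names and the stand-in models here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(x):
    # Stand-in "generator": identity plus a small perturbation
    # (a real model would be a conditional image-to-image network).
    return x + 0.1 * rng.standard_normal(x.shape)

def make_discriminator(seed):
    # Each stand-in "discriminator" maps an image to a realism score in (0, 1).
    w = np.random.default_rng(seed).standard_normal()
    def d(img):
        return 1.0 / (1.0 + np.exp(-w * img.mean()))
    return d

def md_pix2pix_loss(fake, real, discriminators, lam=100.0):
    # Average the non-saturating adversarial loss over all discriminators
    # (the multiple-discriminator variant), then add the usual pix2pix
    # L1 reconstruction term weighted by lam.
    adv = np.mean([-np.log(d(fake) + 1e-8) for d in discriminators])
    l1 = np.abs(fake - real).mean()
    return adv + lam * l1

x = rng.standard_normal((8, 8))      # conditioning input (e.g. label map)
real = rng.standard_normal((8, 8))   # target satellite image patch
fake = generator(x)
ds = [make_discriminator(s) for s in (1, 2, 3)]
loss = md_pix2pix_loss(fake, real, ds)
print(float(loss))
```

In this sketch the generator's update would minimize the combined loss while each discriminator is trained separately to distinguish real from generated patches; the averaging step is one simple way multiple discriminators could be aggregated.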