Generative adversarial networks (GANs) have emerged as one of the most widely used approaches for generating realistic samples. They are effective latent variable models for learning complex real-world distributions. Despite their enormous success and popularity, however, training GANs remains challenging and suffers from several failure modes. These include mode collapse, where the generator produces the same set of outputs for different inputs, ultimately leading to a loss of diversity; non-convergence, caused by the diverging and oscillatory behavior of the generator and discriminator during training; and vanishing or exploding gradients, which result in either no learning or extremely slow learning. In recent years, a variety of strategies for stabilizing GAN training have been explored, including modified architectures, alternative loss functions, and other methods. The choice of loss function has been found to be the most crucial part of the GAN model because it directly influences vanishing gradients and mode collapse. Viewing these loss functions as divergence minimization has provided a rich avenue of development. Together, these factors make GAN training inherently unstable, and this instability is difficult to analyze mathematically. This paper aims to provide a thorough mathematical explanation of these divergence minimization functions. It describes in detail the two variants of the loss function of the original GAN, their optimization in terms of Kullback-Leibler (KL) divergence and Jensen-Shannon (JS) divergence, and their shortcomings. It also describes the loss functions of the GAN variants that have been proposed to mitigate these shortcomings, along with their minimization. The original GAN and its loss function variants have also been implemented on the standard MNIST, Fashion-MNIST, and CIFAR-10 datasets.
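As a brief sketch of the divergence view referenced above, the following equations restate the well-known minimax objective of the original GAN and its reduction, at the optimal discriminator, to the Jensen-Shannon divergence; the notation $p_{\text{data}}$ (data distribution), $p_g$ (generator distribution), and $p_z$ (latent prior) is introduced here only for illustration and follows the standard formulation.

% Minimax (saturating) objective of the original GAN:
% the discriminator D maximizes V, the generator G minimizes it.
\begin{equation}
  \min_G \max_D V(D,G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
\end{equation}
% For a fixed generator, the optimal discriminator is
\begin{equation}
  D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},
\end{equation}
% and substituting D^{*} back into V shows that the generator then minimizes
\begin{equation}
  C(G) = -\log 4 + 2\,\mathrm{JSD}\!\left(p_{\text{data}} \,\|\, p_g\right),
\end{equation}
% i.e., training the generator against the optimal discriminator minimizes
% the Jensen--Shannon divergence between the data and generator distributions.
% The second, non-saturating variant instead maximizes
% \mathbb{E}_{z \sim p_z}[\log D(G(z))], which changes the effective
% divergence being minimized and is discussed in the body of the paper.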