Variational Auto-Encoders (VAEs) are deep latent space generative models that have been immensely successful in many applications, such as image generation, image captioning, protein design, mutation prediction, and language modeling, among others. The fundamental idea in VAEs is to learn the distribution of the data in such a way that new, meaningful data can be generated from the encoded distribution. This concept has led to tremendous research and many variations in the design of VAEs in the last few years, creating a field of its own, referred to as unsupervised representation learning. This paper provides a much-needed comprehensive evaluation of the variations of VAEs based on their end goals and resulting architectures. It further provides intuition as well as the mathematical formulation and quantitative results of each popular variation, presents a concise comparison of these variations, and concludes with challenges and future opportunities for research in VAEs.