The Gray Wolf Optimizer (GWO) is a population-based meta-heuristic algorithm in the swarm intelligence family, inspired by the social behavior of gray wolves, in particular their social hierarchy and hunting mechanism. Because of its simplicity, flexibility, and small number of tunable parameters, it has been applied to a wide range of optimization problems. However, it has some disadvantages, such as poor exploration ability, stagnation at local optima, and slow convergence. Different variants of GWO have therefore been proposed to address these disadvantages. In this article, literature from well-known publishers, especially from the last five years, is reviewed and summarized. First, the inspiration and the mathematical model of GWO are explained. Next, the improved GWO variants are divided into four categories and discussed, and the methodology and experiments of each variant are described. The study ends with a concluding summary of the main foundations of GWO and suggests some possible directions for future work.