Artificial Intelligence (Al) models are being produced and used to solve a variety of current and future business and technical problems. Therefore, AI model engineering processes, platforms, and products are acquiring special significance across industry verticals. For achieving deeper automation, the number of data features being used while generating highly promising and productive AI models is numerous, and hence the resulting AI models are bulky. Such heavyweight models consume a lot of computation, storage, networking, and energy resources. On the other side, increasingly, AI models are being deployed in IoT devices to ensure real-time knowledge discovery and dissemination. Real-time insights are of paramount importance in producing and releasing real-time and intelligent services and applications. Thus, edge intelligence through ondevice data processing has laid down a stimulating foundation for real-time intelligent enterprises and environments. With these emerging requirements, the focus turned towards unearthing competent and cognitive techniques for maximally compressing huge AI models without sacrificing AI model performance. Therefore, AI researchers have come up with a number of powerful optimization techniques and tools to optimize AI models. This paper is to dig deep and describe all kinds of model optimization at different levels and layers. Having learned the optimization methods, this work has highlighted the importance of having an enabling AI model optimization framework.